Method and apparatus for extracting cluster shape features from digital images

ABSTRACT

A method and apparatus for extracting cluster shape features in two- and three-dimensional images in a single scan is provided. The images, consisting of one or more classes, may be unbounded in one of their dimensions. As an image is scanned, each cluster fragment is assigned one or more cluster labels. These labels are used to merge the cluster fragments into larger cluster fragments. An enhanced Hoshen-Kopelman algorithm is employed to determine the cluster shape features for the merged cluster fragments. Cluster label reuse is employed to enable the processing of substantially large images, including unbounded images. After the scanning of an image section is completed, cluster shape feature data are output for completed clusters that extended into the section preceding said section. Optionally, cluster shape features of cluster fragments and completed clusters that extend into said section can also be output.

FIELD OF THE INVENTION

The present invention relates to computer-based cluster analysis and image processing apparatus and method.

BACKGROUND OF THE INVENTION

The present invention is an improvement in computational processing for determining percolation and cluster distribution. An introduction to percolation and cluster theory is presented by Stauffer and Aharony, Introduction to Percolation Theory (Revised 2d ed. 1994). A "cluster" may be thought of in one sense as a group of features that neighbor one another within a universe, e.g. a lattice. A lattice is a special spatial graph where all the nodes in the graph obey translational symmetry. This means that every node in the lattice has the same surrounding as every other node. In lattice terminology, vertices are known as sites. As an example of a lattice, consider a large array of squares: a given square in the middle of the array has four "nearest neighbors" which each share one side in common with the given square. If a feature is present at a given site and present also at a nearest neighbor site, the two sites are said to be in a cluster. The cluster definition does not have to be restricted to nearest neighbors only. It could include next nearest and other neighbors, also. A cluster may extend throughout the lattice, and it can extend in many dimensions corresponding to the dimensionality of the array. Clusters described in the present invention are also known as connected components and should not be confused with feature space clustering.
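By way of illustration only, the effect of the neighbor definition on cluster membership can be sketched in Python. The sketch uses a plain flood fill rather than the HK algorithm, and the names NEAREST, NEXT_NEAREST, DIAGONAL, and count_clusters are hypothetical names chosen for this example. Two diagonally adjacent sites form two clusters under the nearest-neighbor definition and a single cluster once next nearest neighbors are included.

```python
from collections import deque

# Offsets (row, column) for the two neighbor definitions on a square lattice.
NEAREST = [(-1, 0), (1, 0), (0, -1), (0, 1)]
NEXT_NEAREST = NEAREST + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

# Hypothetical 2 x 2 lattice: two occupied sites touching only at a corner.
DIAGONAL = [
    [1, 0],
    [0, 1],
]


def count_clusters(grid, offsets):
    """Count clusters of occupied (truthy) sites with a simple flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or (r, c) in seen:
                continue
            count += 1                      # unvisited occupied site: new cluster
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
    return count


four_neighbor_count = count_clusters(DIAGONAL, NEAREST)        # 2
eight_neighbor_count = count_clusters(DIAGONAL, NEXT_NEAREST)  # 1
```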

Computers lend themselves to cluster analysis. Also, computer-based simulations are often preferable to live experimentation. For example, cluster theory may be useful in determining the consequences of a tree that begins to burn in a forest. If the question is whether a single burning tree will lead to the destruction of the entire forest, and at what rate, depending on how densely packed the other trees are, it would be better from a societal perspective to perform the analysis via a computer simulation rather than by physical experimentation calling for the possible consumption of many forests. Hence, computer analysis is favored. As a practical matter, however, computers almost never have sufficient memory space to store the data for an entire lattice or array of sites. It is generally not practical to attempt to load all sites (data points) into memory and analyze the entire universe of data at once.

The Basic Bounded HK ("BBHK") Algorithm

In 1976, my Hoshen-Kopelman (HK) algorithm was published to allow the simulation and study of large lattices without having to store the entire lattice in computer memory. J. Hoshen and R. Kopelman, "Percolation and Cluster Distribution. I. Cluster Multiple Labeling Technique and Critical Concentration Algorithm," Physical Review B, Vol. 14, No. 8 (Oct. 15, 1976), which is incorporated by reference. The "HK Algorithm" presented there is frequently cited in literature and used in industry. E.g., Constintin, Berry, and Vanderzanden, "Parallelization of the Hoshen-Kopelman Algorithm Using a Finite State Machine," J. of Supercomputer Applications and High Performance Computing (submitted in 1996).

The original HK Algorithm is denoted herein as the Basic Bounded HK (BBHK) algorithm because it was designed to perform cluster analysis on bounded, fixed size lattices in two (2-D) and three (3-D) dimensions. It can also be used for higher dimension lattices and general graphs. The BBHK algorithm was designed to analyze clusters comprising one type of site. Such types are also referred to in the literature as classes. If the BBHK algorithm has to analyze M classes, where M is greater than one, it can do so by performing M passes through the lattice. Pixel images can be viewed as 2-D lattices. Arrays of voxels (volume elements) can be viewed as three-dimensional (3-D) lattices. The BBHK algorithm examines data from a bounded lattice in two or three dimensions. For example, a given lattice may include different types of sites A, B, C, etc. which could represent anything--a tree or no tree, an oil particle or no oil, gray level data or white level data, color A or color B or color C, a topographic feature, etc. The BBHK algorithm would provide a distribution of clusters by cluster size for one class only. Two sites would be found to belong to the same cluster if they are the same class and are neighbors.

Comprehensive descriptions and implementations of the BBHK algorithm are given by Stauffer and Aharony, Introduction to Percolation Theory, supra, and R. J. Gaylord, P. R. Wellin, Computer Simulations with Mathematica (Springer-Verlag, 1995).

The following is a simple example of how the BBHK algorithm works. Consider the pixel data of FIG. 1(a), to be analyzed. It will be appreciated that this collection of data represents an array 100A that has been obtained illustratively by scanning an image area using any type of scanning device, including but not limited to a television camera, optical data digitizer, self-scanned array, arrangement of light emitting diodes, or the like, or in other fashions, and obtaining gray level data. This could come from a pattern recognition system. However, the pixels or array elements need not have come from visually-discernable features. Rather, any features suffice--be they magnetic, gravitational, electrical, nuclear, optical, or otherwise. Illustratively, data for the entire six by six array has been stored and is available for examination.

In FIG. 1(a), white and gray pixels are represented. The gray pixels 102 are represented with the letter "G." The white pixels 104 are left blank. The gray pixels will be analyzed for clusters using the BBHK algorithm.

In this example, the BBHK algorithm examines each row of FIG. 1(a) from left to right, and processes the rows sequentially from top to bottom. A site or pixel position is described by (T, L), where T and L denote the position of a pixel by row and column, starting with the top left corner of the lattice. Thus, the top left corner pixel is denoted by (1, 1). The next position to the right is (1, 2) and the position below (1, 1) is (2, 1). In FIGS. 1(a)-1(c), the algorithm considers no more than the four nearest neighbors for cluster definition.

The BBHK processing starts by examining pixel (1,1). The processor using this algorithm will assign labels as shown in FIG. 1(b), which illustrates another array 100B. In general, each distinct cluster (or cluster fragment) will be assigned a label, and the number of pixels within each cluster is to be accumulated as respective population counts. Whether a cluster fragment is deemed a cluster is determined at a later time based on information not yet available. Hence, a cluster "fragment" is a temporary designation. Thus, because pixel (1,1) is gray, the algorithm assigns a label to it--cluster "1" as shown in FIG. 1(b) at a position 106b corresponding to position (1,1) of FIG. 1(a). Also, it increments a software or hardware population counter (not shown) that has been (or will now be) assigned or set up for cluster number 1. Let the population of pixels in cluster number 1 be represented by N(1). Thus, after one pixel, N(1)=1.

Proceeding to pixel (1,2), the algorithm finds that this pixel includes the feature of interest (gray level data) and now seeks to establish whether this new site is part of any prior cluster fragment or is a new cluster fragment. By examining the pixel (1,1) to the left, the algorithm determines that pixel (1,2) belongs to cluster fragment 1. Therefore, it labels that second pixel with a "1" (see FIG. 1(b)) and increments counter N(1) so that now N(1)=2. The pixel at (1,2) is thus part of the same cluster fragment that includes pixel (1,1).

Continuing to move right within the top row, pixels (1,3) and (1,4) are white value pixels and are skipped; only gray pixels are analyzed for clusters in this example. The next gray pixel is at (1,5). Because pixel (1,4) to its left is a white pixel, the algorithm at this juncture determines that (for now) pixel (1,5) appears to be part of a new cluster fragment. Hence, as shown at 108b, the processor assigns a new label "2" to pixel (1,5) and increments a second counter N(2) to keep track of the number of pixels in cluster fragment number 2. Thus, at this time, N(2)=1. The next pixel is at (1, 6). Because it also is gray, and because pixel (1,5) is gray, the algorithm determines that this sixth pixel belongs to the same cluster fragment number 2 and labels this pixel as "2" and sets N(2)=2. Thus, the top row is completed. Note that no comparisons have been made to any preceding row because this is a border condition and there is no preceding row to the first row. Hence, at the end of scanning the data at the top row, two cluster "fragments" have been identified. It is improper to call them clusters yet, because subsequent processing might show them to be part of the same cluster (and in this case, they are).

The algorithm now proceeds to the second row and skips pixel (2,1) because it is a white pixel. The first gray pixel is at location (2,2). The algorithm now looks to the nearest neighbors above and to the left of pixel (2,2). The pixel immediately above it is (1,2) and it is gray. The pixel to the left is white. Accordingly, the algorithm determines that gray pixel (2,2) belongs to the same cluster fragment as pixel (1,2). Since pixel (1,2) was labeled as part of cluster fragment 1, the algorithm assigns the label "1" to pixel (2,2) also. It now increments the N(1) counter to indicate that there are three pixels in cluster fragment number 1, setting N(1)=3.

The next gray pixel along the second row is at position (2,4). The algorithm looks at the pixels above and to the left and finds that both pixels are white. Hence, the algorithm concludes that this new bit of gray data belongs to a new cluster fragment. Accordingly, as shown at 110b, the processor assigns a new cluster fragment label "3" to this pixel and increments a third counter for cluster fragment number 3, setting N(3)=1.

The next gray pixel 112b is immediately to the right at location (2,5). The algorithm compares it to the pixels above and to the left. In this instance, the pixel above was determined to belong to cluster fragment number 2, as shown at 108b, and the pixel to the left was labeled as belonging to cluster fragment number 3, as shown at 110b. This next pixel (2,5) apparently belongs to "both" cluster fragments. However, it is now seen that "both" cluster fragments are really part of one single cluster ("they" have contiguous edges). By convention, preference is given to the smaller number as the label, and pixel (2,5) is thus labeled as belonging to cluster fragment number 2. The counter N(2) for cluster fragment number 2 is incremented. The gray pixel 110b at position (2,4) is not relabeled from "3" to "2". Cluster fragment number 3 had only one count so that single count also is added to the count for cluster fragment number 2. Hence, at this point, N(2)=4. Before proceeding much further, because the BBHK algorithm has now determined that cluster fragment 3 belongs with cluster fragment 2, a record must be made so that if another label "3" is later encountered, the site should be regarded as part of cluster fragment 2. To do this, a negative number is loaded into the population counter for cluster fragment 3. In this instance, the algorithm sets N(3)=-2, where the -2 denotes that label 3 points to label 2.

The last pixel in row 2 is at location (2,6) and has gray data. Both its neighbor above and its neighbor to the left have been labeled as belonging to cluster fragment 2, and accordingly this pixel also is labeled as belonging to cluster fragment 2. The population counter for cluster fragment number 2 is again incremented, and now N(2)=5.

The examination by the processor using the BBHK algorithm continues until the entire six by six lattice has been examined. A new cluster fragment is apparently encountered at position (3,1) and is assigned label 4. However, the next gray pixel is at position (3,2) and is found to be adjacent to pixels that have been labeled as belonging to cluster fragment number 1 and cluster fragment number 4. Accordingly, by the convention mentioned above, pixel (3,2) is labeled as belonging to cluster fragment number 1, as shown at 114b. The number count from cluster fragment number 4 is added to the number count (population count) for cluster fragment number 1 so that N(1)=5. Also, the algorithm adjusts the contents of the fourth counter so that N(4)=-1, indicating that cluster fragment 4 points to cluster fragment 1. Pixel (3,3) is gray and adjacent to a cluster fragment number 1 site, so it is labeled as belonging to cluster fragment 1, and N(1)=6.

Pixel (3, 4) is gray and is found to be adjacent to sites of cluster fragment number 1 and cluster fragment number 3. By convention, this pixel is assigned to cluster fragment number 1, and the population count for cluster fragment number 3 is now added to cluster fragment number 1. However, the record at this moment indicates that the population count for cluster fragment number 3 equals minus two (N(3)=-2). This indicates that cluster fragments 3 and 2 have previously merged. Hence, the population count for cluster fragment number 2 (which includes the count for cluster fragment 3) is now added to the population count for cluster fragment number 1. Moreover, giving preference to the lower number as a general rule, the population count for cluster fragment number 3 is reset to equal (-1) and the population count for cluster fragment number 2 is also set to equal (-1). At the end of the third row, it will be seen that although four apparently distinct cluster fragments were encountered, it has now been determined that they all belong to a single cluster. Further, the population of this larger cluster, labeled as number 1, includes 12 pixels. Thus, N(1)=12. It will be seen in FIG. 1(a) that there are no other gray pixels that have an adjacent edge to any pixel that belongs to cluster number 1. During the scanning of the fourth, fifth, and sixth rows, (apparently) new cluster fragments 5, 6, and 7 are encountered and labeled. Hence, array 100B shows labels 1 through 7.

At the end of the examination of the entire six by six lattice 100A, the BBHK algorithm has loaded data into seven (population) counters N(k): N(1)=12, N(2)=-1, N(3)=-1, N(4)=-1, N(5)=10, N(6)=-5, and N(7)=-5. The counters that have positive numbers correspond to cluster (fragment) numbers 1 and 5, and they provide the cluster numbers for the lattice. This six by six array has been found to have just two clusters, and their sizes (populations) are 12 and 10. Labels 1 and 5 are denoted as the "proper" labels of the clusters because they carry the cluster population number information. The other labels are direct or indirect pointers to the proper labels. The fact that a counter fails to give a "proper" label is indicated by the fact that a negative number is stored in the counter. The magnitude of that number is the label being pointed to (which may identify a cluster).
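The walkthrough above can be sketched in Python as follows. This is an illustrative sketch only, not the listing of the 1976 article. FIG. 1(a) is not reproduced in this text, so the array GRID is an assumption: its first three rows are reconstructed from the walkthrough, and its last three rows are a hypothetical layout chosen only to be consistent with the counters reported above. The names find_proper and bbhk are likewise hypothetical.

```python
# 1 = gray pixel, 0 = white pixel. Rows 1-3 follow the walkthrough; rows 4-6
# are a hypothetical layout consistent with the reported counters.
GRID = [
    [1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 1, 1],
    [0, 1, 1, 1, 1, 0],
]


def find_proper(N, k):
    """Follow negative counters (pointers) until a proper label is reached."""
    while N[k] < 0:
        k = -N[k]
    return k


def bbhk(grid):
    """Scan rows top to bottom, left to right; return (labels, counters N)."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    N = {}          # N[k] > 0: population of proper label k; N[k] < 0: pointer
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue                        # white pixels are skipped
            raw = set()                         # labels seen above and to the left
            if r > 0 and labels[r - 1][c]:
                raw.add(labels[r - 1][c])
            if c > 0 and labels[r][c - 1]:
                raw.add(labels[r][c - 1])
            if not raw:
                p = next_label                  # new cluster fragment
                next_label += 1
                N[p] = 1
            else:
                proper = {find_proper(N, k) for k in raw}
                p = min(proper)                 # preference to the smaller label
                total = 1 + sum(N[k] for k in proper)
                for k in (proper | raw) - {p}:
                    N[k] = -p                   # point merged labels at p
                N[p] = total
            labels[r][c] = p
    return labels, N


labels, N = bbhk(GRID)
# N == {1: 12, 2: -1, 3: -1, 4: -1, 5: 10, 6: -5, 7: -5}
cluster_sizes = sorted(n for n in N.values() if n > 0)   # [10, 12]
```

Re-pointing the raw neighbor labels, and not only the proper labels, at the surviving label is what resets N(3) from -2 to -1 at pixel (3,4) in this sketch.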

The prior art BBHK algorithm also describes that labels can be reused. In the previous example, the computer memory needed to store the information on all 36 pixels, i.e., all six rows. While this is feasible for small lattices, it becomes much more unwieldy and exhausts the computer memory capabilities when extremely large lattices need to be analyzed. Thus, by the BBHK algorithm technique of reusing or recycling labels, much less memory is required. This technique is illustrated with regard to FIG. 1(c), which shows an array 100C. In this variation, the computer needs to store only two rows, each of size six, at a single time and reuse the labels. For this example, the processor can assign {1, 2, 3} as the label set for the odd lines and the labels {4, 5, 6} for the even lines, realizing that because each row has only six elements, the maximum number of labels that could be needed for any row would result from the alternating pattern 0-1-0-1-0-1, calling for at most only three labels per row. Using the labeling recycling, intermediate cluster results are collected whenever a row is completed. The cluster tally results are taken for the row just prior to the completed row.

The first row in FIG. 1(c) is labeled exactly as the top row in FIG. 1(b). However, the second row is labeled by the label set {4, 5, 6}. In this instance, all clusters found in row 1 will next be found to belong to clusters in row 2. Accordingly, all the labels from row 1 will be updated to point from row 1 to the labels of row 2. Hence, the processor sets N(1)=-4 and accumulates the total number of pixels for cluster number 4, i.e., N(4)=3. Likewise, the computer sets N(2)=-5 and N(5)=5. Now, when the processor finishes with row 2, it can forget row 1 because all the labels from row 1 point to labels in row 2, and it will never encounter row 1 again. Additionally, note that none of the clusters of row 1 are complete because the population counts (N) are all negative and point to row 2.

The third row will reuse labels {1, 2, 3}. The algorithm now sets N(4)=-1, N(5)=-1, and N(1)=12. Again, clusters of row 2 are found to be incomplete because the population counts are all negative.

Proceeding to row 4, the two gray pixels will be assigned the (recycled) label for cluster number 4, and N(4)=2. Now, inspecting the labels of row 3, the algorithm notes that the population count for the sole cluster is a positive integer, i.e. N(1)=12, which implies that this cluster number 1 is complete and does not extend to row number 4. Hence, it can be tallied.

The algorithm follows the same procedure for row 5, recycling again the labels from the label set {1, 2, 3}. At the end of row 5, the algorithm has set population counters as follows: N(4)=-2, N(2)=5, and N(1)=1. Note that the cluster label "1" is used again in row 5, even though it was used in rows 1 and 3.

For the sixth row, the algorithm assigns cluster labels from the label set {4, 5, 6}. In this instance, at the end of row 6, the algorithm has determined N(4)=10, N(1)=-4, and N(2)=-4. Hence, there are no complete clusters for row 5. Further, since row 6 is the last row, the processor knows that N(4) is complete. Hence, using the approach of label recycling, in the six by six lattice, the processor using this algorithm identified and tallied two clusters having populations 12 and 10--exactly the same result as obtained with reference to FIG. 1(b). Note that array 100C used only four labels {1, 2, 4, 5} and that labels 3 and 6 were not required.
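The label recycling procedure described above can be sketched in Python under stated assumptions. The sketch keeps only the label rows for the previous and current lines, alternates between two label sets, and tallies a cluster once its counter is still positive after the following row has been scanned. The array GRID is an assumption, since FIG. 1(a) is not reproduced in this text: its first three rows are reconstructed from the walkthrough and its last three rows are a hypothetical layout consistent with the reported counters. The names find_proper and scan_with_recycling are hypothetical.

```python
GRID = [
    [1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 1, 1],
    [0, 1, 1, 1, 1, 0],
]


def find_proper(N, k):
    """Follow negative counters (pointers) until a proper label is reached."""
    while N[k] < 0:
        k = -N[k]
    return k


def scan_with_recycling(rows, width):
    """Return (populations of completed clusters, final counters N)."""
    half = (width + 1) // 2                 # most labels one row can need
    label_sets = [list(range(1, half + 1)),
                  list(range(half + 1, 2 * half + 1))]
    N = {}
    prev = [0] * width                      # labels of the previous row only
    completed = []
    for r, row in enumerate(rows):
        pool = label_sets[r % 2]            # odd lines: first set; even: second
        used = 0
        cur = [0] * width
        for c, gray in enumerate(row):
            if not gray:
                continue
            raw = {k for k in (prev[c], cur[c - 1] if c else 0) if k}
            proper = {find_proper(N, k) for k in raw}
            in_row = [k for k in proper if k in pool]
            if in_row:
                p = min(in_row)             # keep a label of the current row
            else:
                p = pool[used]              # take (recycle) the next label
                used += 1
                N[p] = 0
            total = 1 + sum(N[k] for k in proper | {p})
            for k in (proper | raw) - {p}:
                N[k] = -p                   # previous-row labels point forward
            N[p] = total
            cur[c] = p
        # A previous-row label whose counter is still positive is complete.
        completed += [N[k] for k in sorted(set(prev)) if k and N[k] > 0]
        prev = cur
    completed += [N[k] for k in sorted(set(prev)) if k and N[k] > 0]
    return completed, N


completed, N = scan_with_recycling(GRID, 6)
# completed == [12, 10]; only labels 1, 2, 4, and 5 were ever used
```

Because rows may be any iterable, the same function could in principle consume rows from a generator, which is the property that makes label reuse attractive for very long inputs.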

The BBHK algorithm has found application in many areas. One technology is X-ray microscopic tomography, in which a sample is divided into vertical cuts. In each plane, the sample is scanned from several angles. For each angle, the fraction of X-ray energy that is absorbed is measured. Using the multiple absorption data, the absorptivity coefficient of each voxel in the cut is determined. The variation in absorptivity between voxels creates the image. To create discrete classes for voxels, the absorptivity is partitioned into ranges, where each range corresponds to a different class. See, e.g. Kinney et al., "The X-ray Tomographic Microscope: Three Dimensional Perspectives Of Evolving Microstructures," Nuclear Instruments and Methods in Physics Research A 347 (North-Holland 1994) pp. 480-486; Kinney et al., "In Vivo, Three Dimensional Microscopy Of Trabecular Bone," J. of Bone and Mineral Research, Vol. 10, No. 2 (Blackwell Science 1995) pp. 264-270; and King et al., "X-ray Tomographic Microscopy Investigation Of The Ductile Rupture Of An Aluminum Foil Bonded Between Sapphire Blocks," Scripta Metallurgica et Materialia, Vol. 33, No. 12 (Elsevier Science/Acta Metallurgica 1995) pp. 1941-1946; all of which are incorporated by reference.

The BBHK algorithm enumerated cluster populations in a bounded lattice or array. While proven to have wide-ranging applications, the prior algorithm fundamentally determined cluster population only. In the 20 years since the HK algorithm was introduced, it has not been extended for the calculations of interest in shape analysis.

Moreover, the BBHK algorithm was inherently constrained by size and was not able to analyze data that is unbounded in at least one dimension.

Furthermore, the BBHK algorithm cannot handle more than one class in a single pass through the lattice. The present invention addresses such issues.

SUMMARY OF THE INVENTION

The present invention applies to all prior applications of the BBHK algorithm, and accordingly applies to all technologies mentioned already, which is not an exhaustive listing. The invention concerns a cluster analysis computer and method, and an image processing system and method, useful for the analysis of data that may or may not contain clusters. Such data may be obtained illustratively from inspecting items in production or use, in pattern recognition, and pattern detection and image analysis. Data may be obtained from sensing organic tissue (e.g. human or animal), inorganic matter, a workpiece, or item(s) under inspection, although in other applications, the data may be simulated. The invention uses computer algorithms to process the data and report the findings.

The cluster analysis computer and image processing system of the present invention determines one or more aspects of clusters among the data. The clusters may represent defects in the workpiece, fissures in solid objects or objects subject to fissures, topological or topographical features from aerial or space observation, chemically bonded molecules or atoms, surface features of a workpiece such as magnetic tape or the like, crystalline structures, anomalies in bodies where the sensing method involves radiology or tomography or other medical processes that seek anomalies in human or other bodies, and numerous other applications.

One independent aspect of the present invention is an "Enhanced HK" (EHK) algorithm that provides the analysis tool to evaluate cluster parameters other than population count for a single class. Such parameters may include, for example, cluster moments. Importantly, whereas the prior art required multiple passes to determine multiple classes, the EHK algorithm allows a single pass analysis of multiple classes.

A separate and independent feature of the present invention is an "Unbounded HK" (UHK) feature which allows cluster parameter calculations for an unbounded lattice.

The EHK and UHK features can be combined to form an enhanced unbounded HK ("EUHK") algorithm. In the unbounded method, a multidimensional lattice where at least one of the coordinates is not constrained by size can nevertheless be examined continuously and indefinitely.

A further feature of the present invention is that this analysis tool can still be applied to non-lattice graphs. Indeed, the EHK can be used with algorithms that join two labeled cluster fragments and is not restricted to lattice graphs.

Several summary aspects of the present invention are as follows:

(1) The cluster image data could be transferred to memory from any sensory device or from mass storage such as disk, or be generated by the computer through prior art methods.

(2) Sites are stored in data structures in memory. The position of a site can be inferred from the site's data structure. The site data structure also defines the class of the site.

(3) For each site i, f^(n)(i) properties are defined, where n defines the type of property. For each site there could be one or more such properties.

(4) Each cluster is composed of sites. For a site to be a member of an existing (already-found) cluster, it has to be a neighbor of one member of the cluster and it must be of the same class t as the other members of the cluster.

(5) Clusters and cluster fragments are represented by their class identifier and a set of labels. Of the set of labels, one label k is considered to be the proper label of the cluster. The other labels of the cluster point directly or indirectly to the proper label. By the present invention, associated with this label are cluster parameters F^(n)(k,t), where the term F^(n)(k,t) denotes a cumulative property of the site f^(n)(i) property for sites of class t belonging to the cluster.

(6) When the sites are scanned in memory and site i is found not to have neighbor sites of the same class or that its neighbor sites have not been labeled yet, a new cluster label k is created by prior art for that cluster. By the present invention, its F^(n)(k,t) parameter is initialized to the site i f^(n)(i) property.

(7) When the sites are scanned in memory and a site i is found to have neighbor sites of the same class and that its neighbor sites are already labeled, that site is merged with the cluster fragments to which these neighbors belong. Merging of clusters is known in the prior art. By the prior art technique, the proper label of each of the cluster fragments is determined and a new proper label p is determined for the merged cluster. The value of p can be either one of the proper labels of the cluster fragments or it can be a new label. By the present invention, the cumulative property F^(n)(p,t) of the merged cluster is determined by performing a successive general binary operation ⊕ on the cluster fragments. If there are m cluster fragments denoted by the k₁, k₂, . . . , k_(m) cluster labels where each label is distinct, a cumulative value for the merged cluster is evaluated as follows by the present invention:

F^(n)(k₁,t) ⊕ F^(n)(k₂,t) ⊕ . . . ⊕ F^(n)(k_(m),t) ⊕ f^(n)(i).

That value is stored in F^(n)(p,t). The operation ⊕ is required to be commutative and associative. It can represent scalar or vector addition, scalar multiplication, set union, set intersection, selection of a maximum or minimum value of its operands, or other operations.

(8) The output of the analysis could be the F^(n)(p,t) parameters and other parameters that can be derived from the F^(n)(p,t) parameters. The output can be printed, displayed on a console, or stored in a mass storage device.

(9) Moreover, in a special case according to the present invention, the f^(n)(i) quantity can represent x(i)^(a) y(i)^(b) z(i)^(c) where x(i), y(i), z(i) represent the x,y,z coordinates of the i'th site and a,b,c are non-negative integer exponents. Here the operation ⊕ would denote the normal addition operation. The F^(n)(p,t) parameters would be cluster moments for x, y, and z.

(10) In another special case according to the present invention, the f^(n)(i) quantity is 1 if site i is on the cluster perimeter; otherwise it is 0. The operation ⊕ would denote the normal addition operation. The F^(n)(p,t) parameter would be the count for the cluster perimeter sites.

(11) In another special case according to the present invention, the f^(n)(i) quantity represents x(i), the x coordinate of the i'th site. The operation ⊕ would denote selection of the maximum value of the operands. F^(n)(p,t) denotes the maximum x coordinate value for the cluster. Similarly, another ⊕ could define the selection of the minimum value of its operands. In that case, F^(n)(p,t) would denote the minimum x value for the cluster. By repeating these operations for y and z, the method and apparatus would obtain the bounding box for the cluster.

(12) In another special case according to the present invention, the f^(n)(i) quantity represents the number of neighbor sites of site i that already have been scanned. The operation ⊕ would denote the normal addition operation. F^(n)(p,t) would denote the number of the cluster's internal edges.

(13) In still another special case according to the present invention, the f^(n)(i) quantity represents the number of non-neighbor sites of site i. The operation ⊕ would denote the normal addition operation. F^(n)(p,t) would denote the number of edges from the cluster perimeter sites pointing to non-cluster sites.
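Aspects (5) through (7), (9), and (11) can be sketched together in Python. This is a minimal sketch under stated assumptions and is not one of the appendix listings: the names find_proper, ehk_scan, add, and box, the small three by three array GRID, and the use of (row, column) indices in place of x and y coordinates are all hypothetical choices made for this example. The caller supplies the site property as f and the commutative, associative operation ⊕ as op, and every non-background class is handled in the same pass.

```python
# Hypothetical two-class lattice; 0 would denote a background class.
GRID = [
    [1, 1, 2],
    [2, 1, 2],
    [2, 2, 2],
]


def find_proper(parent, k):
    """Follow label pointers until the proper label is reached."""
    while parent[k] != k:
        k = parent[k]
    return k


def ehk_scan(grid, f, op, background=0):
    """Single pass over all classes; returns {proper label: (class, F)}."""
    cols = len(grid[0])
    labels = [[0] * cols for _ in grid]
    parent, F, cls = {}, {}, {}
    next_label = 1
    for r, row in enumerate(grid):
        for c, t in enumerate(row):
            if t == background:
                continue
            raw = set()                       # same-class neighbors already scanned
            if r > 0 and grid[r - 1][c] == t:
                raw.add(labels[r - 1][c])
            if c > 0 and row[c - 1] == t:
                raw.add(labels[r][c - 1])
            proper = sorted({find_proper(parent, k) for k in raw})
            value = f(r, c)                   # property of the site itself
            if proper:
                p = proper[0]
                for k in proper:              # fold each fragment in with op
                    value = op(F.pop(k), value)
                    parent[k] = p
            else:
                p = next_label                # new cluster fragment
                next_label += 1
                parent[p] = p
                cls[p] = t
            F[p] = value
            labels[r][c] = p
    return {p: (cls[p], F[p]) for p in sorted(F)}


def add(a, b):
    """Vector addition, as used for cluster moments."""
    return tuple(x + y for x, y in zip(a, b))


def box(a, b):
    """Componentwise min/max, as used for a bounding box."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


# Zeroth and first moments: (population, sum of rows, sum of columns).
moments = ehk_scan(GRID, lambda r, c: (1, r, c), add)
# moments == {1: (1, (3, 1, 2)), 2: (2, (6, 8, 7))}

# Bounding box: (min row, min column, max row, max column).
boxes = ehk_scan(GRID, lambda r, c: (r, c, r, c), box)
# boxes == {1: (1, (0, 0, 1, 1)), 2: (2, (0, 0, 2, 2))}
```

Keeping the pointers in a separate parent table, rather than encoding them as negative counters, is a design choice of this sketch; it lets the cumulative value be a tuple, a set, or any other type on which the supplied operation is defined.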

BRIEF DESCRIPTION OF THE DRAWINGS AND APPENDICES

In describing the detailed embodiments of the present invention, reference is made to accompanying drawings and appendices, wherein:

FIG. 1(a) represents gray pixel data to be analyzed and FIGS. 1(b) and 1(c) are both developed from FIG. 1(a) using the prior art HK algorithm, but FIG. 1(c) is developed using HK label recycling;

FIG. 2 shows a flow chart of an EHK algorithm;

FIG. 3 shows an inspection system for an unbounded workpiece that is being analyzed according to the present invention;

FIG. 4 shows a generalized graph having supernodes to be analyzed using the present invention;

FIG. 5 shows an architecture suitable for use in an embodiment of the present invention;

FIG. 6 illustrates how a register points to a location in memory, in an application of the present invention;

FIG. 7 is used to illustrate how a color image can be classified into two classes "1" and "2" based on red and green components of the image color;

FIG. 8 illustrates a set of vector data points, on which a reticulated grid has been overlain, which may be used in the present invention;

FIG. 9 illustrates how a workpiece that is unbounded in one dimension may be analyzed according to a feature of the present invention;

FIG. 10 illustrates a five by six array that contains multiple classes, all of which may be analyzed by the present invention simultaneously;

FIGS. 11A, 11B, and 11C represent data structures that may be employed in practicing the present invention;

FIGS. 12A and 12B illustrate data structures in the form of an N array and an F^(n) array, respectively, as may be used in the present invention;

FIG. 13 illustrates an alternative data structure to FIG. 12 useful in practicing the present invention;

FIG. 14 illustrates another alternative data structure useful in practicing the present invention;

Appendix 1 shows an EHK algorithm listing corresponding to the data structure of FIG. 12;

Appendix 2 shows an EUHK algorithm listing;

Appendix 3 shows an EHK algorithm listing corresponding to the data structure of FIG. 13; and

Appendix 4 shows an EHK algorithm listing corresponding to the flow chart of FIG. 2 and the data structure of FIG. 14.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

The present invention provides an analysis tool for the examination of physical observations or events. That is, data may be derived from any of the forces known in the universe, whether gravitational, electrical, magnetic, or nuclear. This obviously includes any combinations thereof such as the electromagnetic spectrum. Consequently, the present invention extends to image processing where an image is derived from the use of any portion of the electromagnetic spectrum, whether infrared, visible, ultraviolet, or beyond. The invention also applies to MRI (magnetic resonance imaging), ultrasound imaging, electromicroscopy, electron tunneling microscopy, and any other parameters that can be assigned space coordinates. This includes such parameters as temperature and pressure also. Physical observations may moreover be obtained by electron microscopy, or any other technique for the collection of data. After the physical data (or other types of data, such as observation of behavior, for example) are collected, the data may be processed into a graph or an array of any number of dimensions. The present invention is not restricted to the particular manner of obtaining data.

The Enhanced HK Feature

The enhanced HK feature (EHK) permits the calculation of new cluster parameters (beyond the population count) for multiple classes within the context of the HK algorithm. These enhancements fit naturally with the UHK features described below.

Turning attention to FIG. 2 and Appendix 4, Appendix 4 is a commented program listing setting forth an example of the EHK features, and FIG. 2 is a flow chart of the EHK algorithm. In the BBHK algorithm as described above, the information sought is concerned with the number of clusters within a bounded border lattice and the population count for each such cluster. Suppose now, instead, one desires to examine a different property.

Thus, let f.sup.(n) (i) be some nth property of the i'th site. For one example, the function value could be zero if all sites adjacent to site i are members of the same cluster. The expression f.sup.(n) (i) could be a scalar, a vector, or a general array. For example, f.sup.(n) (i) could denote that site i is on a cluster perimeter: when the site is on the perimeter, then f.sup.(n) (i) is set to 1, but otherwise it is set to 0 (to indicate a site not on the cluster boundary).

Another f.sup.(n) (i) could denote the x coordinate of the i'th site so that f.sup.(n) (i)=x_(i). As another example, f.sup.(n) (i) could also denote a non-scalar quantity such as a vector, e.g. f.sup.(n) (i)=(x_(i) ², y_(i) ², z_(i) ²).

Thus, FIG. 2 shows a subroutine or flow chart 200. Much of FIG. 2 corresponds to a similar flow chart in my 1976 article promulgating the HK algorithm, but at least portions Ψ and Ξ are new. The subroutine in FIG. 2 analyzes all clusters that belong to non-background classes. Background classes are classes that are not analyzed for clusters by the algorithm. At box 212, the algorithm, illustratively running on a programmed computer, determines whether any further lattice sites need to be examined. If such a further site exists, then the processor advances to box 214 in the program where a determination is made as to whether the next site i belongs to a non-background class t. If not, the processor proceeds to the next lattice site, indicated at 216.

However, if site i is in fact found to contain a class t characteristic, then the processor advances to box 218. At 218, the processor examines the sites neighboring site i. If there is no neighbor found that also belongs to class t, then site i must be part of a new cluster. In this condition, the program advances to box 220 where site i is assigned a new cluster label. In FIG. 2, Greek letters denote cluster fragments. Hence, a first cluster fragment to be encountered may be named α, the second one β, and so on. (For example, in FIG. 1(a), the processing of array 100A located seven different cluster fragments, but these seven fragments were determined to comprise just two clusters.) Let k represent the number of cluster fragments that have been numbered separately. Hence, until a first cluster fragment is found, k=0. When a first cluster fragment is found, k is incremented, and is incremented each time another cluster fragment is found.

This is conveniently denoted by the expression k←k+1, which means store the value of the expression to the right of the arrow in the memory location pointed to by the variable to the left of the arrow. The (left-pointing) arrow ← is used in this sense herein in the figures and appendices.

The program next advances to box 222 which is divided into two sections, 222a and 222b, separated by a broken line. In portion 222a, the operations described with reference to the BBHK algorithm are executed. In particular, the site i is assigned to the new cluster label and the population counter for that cluster label is initialized. In portion 222a, the expression S_(i) means site i, and S is an array that stores all the lattice sites in sequential position in computer memory. The variable i denotes the position of S_(i) relative to the first array element in memory. Variables i and k are integer identifiers stored in memory. As mentioned, k is a count that increments whenever a new cluster fragment is encountered so that a new label can be created.

The first line in box portion 222a is S_(i) ←m₁.sup.η =k+1. This means that the new cluster identifier will be the letter η, that the element at site i is labeled as m₁.sup.η, and that there is now one further label. The second line in box portion 222a is N(m₁.sup.η)←1, which means that the population counter for the new cluster fragment is set to a count of 1 (and it is thereby initialized). The variable N is an array that is maintained in computer memory. If an element of array N is positive, it denotes the number of sites in the cluster fragment and its index denotes the cluster's proper label; if negative, it points to a proper label directly or indirectly (via an intermediary). (FIG. 14 represents an N array in computer memory, where some elements of the array are positive and others are negative. FIG. 14 will be discussed infra.)

Turning now to box portion 222b, which concerns the EHK feature, the cluster analysis processor or computer performs further operations. The expression F, like N, is another cumulative value. These steps again are initialization of the variables shown in portion 222b, namely F.sup.(1), F.sup.(2) and so on. Variable F.sup.(1) is a general quantity which could denote a summation over all x values of the sites in the cluster fragment, which is the first moment for x. Variable F.sup.(2) is another general quantity which could be, for example, the first moment for the cluster sites in the y direction. The F and N values are array elements that are stored sequentially in computer memory. (FIG. 14 shows an F.sup.(n) array.) In the operations shown in FIG. 2, the first general quantity F.sup.(1) for m₁.sup.η is set to f_(i).sup.(1). The second general quantity F.sup.(2) for m₁.sup.η is set to f_(i).sup.(2). Following the initialization of the F.sup.(n) quantities, the variable C in portion 222b is also initialized. C identifies the class of the cluster fragment. (FIG. 14 shows a C array.) In the operation shown in FIG. 2, the quantity C for m₁.sup.η is set to the class t of site i. When these operations are finished, the program proceeds to box 224, which routes back to the top of FIG. 2.

Concerning box 218 again, in the event that a search of the neighboring sites determines that a neighbor has indeed been found, the program, instead of advancing to box 220, will advance to box 226. The routine CLASSIFY is applied. The CLASSIFY routine is given in my Oct. 15, 1976 article cited supra, which is incorporated by reference. A version of CLASSIFY is given in Appendix 4.

The processor locates cluster fragments and labels them in memory. As noted, α, β, γ etc. each are cluster fragment identifiers. They are dynamic entities that change during processing until complete clusters are identified. Thus, the processor may identify several array elements as part of cluster α. Each cluster could be multiply labeled by the HK algorithm so that each cluster α is represented by a set of α labels:

    {m.sub.1.sup.α, m.sub.2.sup.α, m.sub.3.sup.α, . . . , m.sub.s.sup.α, . . . ,.sub.2.sup.α }

In this notation, the superscript indicates labels of a given cluster (or cluster fragment if the cluster is not complete). It is seen here that all elements of this set are for cluster α. Hence, using this notation on FIG. 1(b), for example, the first entire cluster would comprise the set of labels {1, 2, 3, 4}. So, m₁.sup.α =1, m₂.sup.α =2, m₃.sup.α =3, and m₄.sup.α =4. In such case, m_(s).sup.α =m₁.sup.α because m₁.sup.α is the smallest in the set. The processor chooses m_(s).sup.α, the smallest value in the set, to be the "proper label" for the cluster. Accordingly, the value of N(m_(s).sup.α) is positive. The other N(m_(p).sup.α) in the set (where p=1, 2, . . . , 4 and p≠s) are negative and point directly or indirectly to the proper label m_(s).sup.α. The choice of the proper label as the smallest value in the set is not mandatory for the HK algorithm. Any other value in the set would do just as well.
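The role of the positive and negative N entries may be illustrated by a minimal sketch. It is not the Appendix 4 listing; the function name proper_label, the list layout (index 0 unused), and the particular N values are assumptions made only for illustration.

```python
# Illustrative N array for the label set {1, 2, 3, 4}; index 0 is unused.
# Positive entry: population count stored at a proper label.
# Negative entry: minus the label that this label points to.
# The values are hypothetical: label 3 points to 2, and 2 and 4 point to 1.
N = [0, 12, -1, -2, -1]


def proper_label(m, N):
    """Follow negative entries until a positive (proper) entry is reached."""
    while N[m] < 0:
        m = -N[m]
    return m


resolved = [proper_label(m, N) for m in (1, 2, 3, 4)]
print(resolved)  # [1, 1, 1, 1]
```

Label 3 reaches the proper label indirectly, through label 2, which is the "via an intermediary" case.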

The processor now proceeds to box 228 having a portion 228a and 228b. Portion 228a corresponds to the contents of a similarly-located box in the flow chart appearing as FIG. 1 of my 1976 article. In portion 228a, site i is given its label classification, and the population counter N(m_(s).sup.α) is updated to equal the sum of the population counters for the cluster fragments plus 1. The "1" stands for the addition of site i into the population count N(m_(s).sup.α) for cluster α. The other N(m_(p).sup.α) are negative and no longer serve the role of population counters. They point to the proper label m_(s).sup.α of the merged cluster.

The processing represented by 228b is new. Portion 228b (corresponding to the symbol Ξ) represents further operations that are both associative and commutative. The illustrative values F.sup.(1) (m_(s).sup.α) and F.sup.(2) (m_(s).sup.α) are updated, and the resulting new cumulative value is stored. The processor then advances to box 216, from which it will return to box 212 to determine whether there are any more lattice sites. Once all the lattice sites have been examined, the processing stops, as indicated at 230.

With further regard to operations that may be performed in processing portion 228b, in using the EHK algorithm, the symbol ⊕.sup.(n) denotes a general binary EHK operation. It could be either a scalar or non-scalar operation. The only requirement on the operation is that it must be associative and commutative. Addition, scalar multiplication, set union, and set intersection are examples of such operations. Functions such as max(a,b) and min(a,b) can also define such operators. In two dimensions, these functions enable the calculation of a bounding box for a cluster, by defining ⊕.sup.(n) as x_(i) ⊕.sup.(n) x_(j) ≡max(x_(i), x_(j)). Accordingly, when merging two partial clusters, this operation picks the max x value of the two operands.
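As a hedged illustration of such an operator, the sketch below merges partial bounding boxes with min and max. The tuple layout (xmin, ymin, xmax, ymax) and the names merge_bbox and site_bbox are assumptions of this sketch and do not come from the appendices.

```python
from functools import reduce


def site_bbox(x, y):
    """Bounding box of a single site: both corners coincide."""
    return (x, y, x, y)


def merge_bbox(a, b):
    """Associative, commutative merge of two (xmin, ymin, xmax, ymax) boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


# Three hypothetical cluster fragments, each consisting of a single site.
fragments = [site_bbox(5, 1), site_bbox(4, 2), site_bbox(6, 2)]

box = reduce(merge_bbox, fragments)
box_reversed = reduce(merge_bbox, list(reversed(fragments)))
print(box)  # (4, 1, 6, 2)
```

Because the operation is associative and commutative, the order in which fragments happen to be merged during the scan does not affect the result, which is why box and box_reversed agree.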

Cluster moments, which characterize the shape of the clusters, are very important properties for image analysis. They can easily be determined using the EHK algorithm. Such moments can be defined in Equation (1): ##EQU1## Here, the left side of Equation (1) denotes the n'th moment of the k'th cluster for the j'th coordinate, where j=1, 2, . . . d, and where d is the space dimension. On the right side of Equation (1), the variable x_(j) ^(n) (i) is the j'th coordinate raised to power n for the i'th site belonging to the k'th cluster. These moments can be calculated using the EHK algorithm. Referring to Appendix 1, the f.sup.(n) (i) quantities in equations <1> to <3> and <7> to <9> would be identified with the x_(j) ^(n) (i) coordinates. The quantity F.sup.(n) (k(t),t) (where t is the cluster class) in equations <1> to <9> would be identified with the partial moments X_(j).sup.(n) (k).

Following Pratt, Digital Image Processing, 2nd Ed. (Wiley & Sons 1991), one can define a general cluster moment M_(k) (m(1), m(2), . . . , m(j), . . . , m(d)) for all coordinates in d dimensions as follows: ##EQU2## where ##EQU3## In Equation (3), the symbol Π denotes a product of the coordinates x_(r) over d dimensions, and m(r) is the power that x_(r) is raised to for the r'th dimension. The summation is carried over all sites i of the cluster k. Equation (1) is a special case of Equation (2), where

    X.sub.j.sup.(n) (k)=M.sub.k (m(1)=0, m(2)=0, . . . , m(j)=n, . . . , m(d)=0)(4)

In Equation (4) only the exponent m(j) is non-zero, so that the product in Equation (3) degenerates into a single factor x_(j) ^(n) (i).
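The relationship between the general moment and its single-coordinate special case can be sketched as follows. The sketch follows the verbal description given above (a per-site product of coordinates raised to the powers m(r), summed over the sites of the cluster); the function names and the sample sites are hypothetical.

```python
from math import prod


def site_term(coords, exponents):
    """Per-site product over dimensions r of x_r(i) ** m(r)."""
    return prod(x ** m for x, m in zip(coords, exponents))


def cluster_moment(sites, exponents):
    """Sum of the per-site terms over all sites of one cluster."""
    return sum(site_term(coords, exponents) for coords in sites)


# Hypothetical three-site cluster, given as (x1, x2) coordinates.
sites = [(1, 1), (2, 1), (2, 2)]

first_x1 = cluster_moment(sites, (1, 0))   # only m(1) non-zero: sum of x1
second_x2 = cluster_moment(sites, (0, 2))  # only m(2) non-zero: sum of x2**2
mixed = cluster_moment(sites, (1, 1))      # mixed moment: sum of x1*x2
print(first_x1, second_x2, mixed)  # 5 6 7
```

With a single non-zero exponent the product collapses to one factor, which is the degenerate case described for Equation (4).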

Now, the processor can calculate the following f.sup.(n) (i) properties and equate them with C_(i) (m(1), m(2), . . . , m(j), . . . , m(d)). It can also equate the partial moment M_(k) (m(1), m(2), . . . , m(j), . . . , m(d)) with the quantity F.sup.(n) (k(t),t) of Appendix 1. Equation (2) can be further generalized by equating the quantity f.sup.(n) (i) with a linear combination of C_(i) (m(1), m(2), . . . , m(j), . . . , m(d)) quantities defined by Equation (3): ##EQU4## The summation in Equation (5) is over all values of m(1), m(2), . . . , m(j), . . . m(d) which are denoted symbolically by p. The quantities a_(p) are constant values. A special case of Equation (5) is the sum of the first moments in two dimensions:

    f.sup.(n) (i)=x.sub.1 (i)+x.sub.2 (i)                      (6)

In Equation (6), x₁ (i) is the x position of site i and x₂ (i) is the y position of site i. The F.sup.(n) (k(t),t) quantities for the k'th cluster would now be given as a linear combination of the generalized moments given by Equation (2). Thus, see Equation (7): ##EQU5##

Some examples are now given of some cluster parameters that can be determined using the EHK feature. In image shape analysis, perimeters and moments are important features. A first example of a cluster parameter that can be calculated using EHK processing is the first moment of a cluster for one coordinate. In a second example, the same calculation is performed where label recycling is used. In a third example, a moment of inertia in two dimensions will be processed. It will be understood that cluster moments in higher dimensions can be processed.

EXAMPLE 1

The EHK feature can be applied to the image data of prior art FIG. 1. It can be used, for example, to calculate the first moment of the x coordinate. First, to specify an illustrative coordinate system, let the positive direction of the x axis be to the right and let the positive direction of the y axis be downward. Hence, in FIG. 1(b), sites are described in the format corresponding to (row, column). That is, the top left corner is site (1,1) for row 1, column 1. The bottom left corner is (6,1). The top right corner is (1,6), and the bottom right corner is thus (6,6). Thus, the first moment of a cluster element at site (1,6) will be 6, because this column is 6 units away from the reference position (0,0) (not shown in the figure) in the x direction. Hence, f.sup.(n) (i)=x(i) for this example.

Using EHK processing, the first moment for each cluster may be determined and stored in a one dimensional array X. The size (population) of array X will be the same as that of array N, the labeling array.

Starting with FIG. 1(b), the top left corner is the site (1,1). The value of x(1,1) is 1, and the proper label of the site is 1. So the initial value for the sum of x moments for cluster number 1 is 1. This is expressed as X(1)=1.

For site (1,2), the first moment x(1,2)=2, and the proper label is 1. Since X(1) is to represent the total value of x moments of all elements of cluster number 1, the process must add 2, which is the x value of (1,2), to X(1). Hence, new X(1)=3.

Sites (1,3) and (1,4) were not marked "G" in FIG. 1(a) and thus are not labeled as belonging to any cluster. As such, they do not contribute to any component of array X. The next lattice site containing a cluster fragment is site (1,5), which is labeled for cluster 2. Since this cluster has not yet been encountered in determining the first moments, the processor now initializes X(2)=x(1,5)=5.

Continuing in this fashion, site (1,6) is labeled as part of cluster number 2, and the x moment of this site is 6. Hence, new X(2)=5+6=11.

In the next row, the first cluster fragment is located at site (2,2), which is labeled for cluster 1. Its first moment is 2. Adding this to the prior X(1) value of 3, new X(1)=5.

Next, site (2,4) is labeled for cluster 3. Hence, the value for X(3) will now be set to 4.

Site (2,5) has a proper label of 2. Cluster fragment label 3 points to label 2. Hence, the new value for X(2) will be the prior X(2) plus X(3) plus x(2,5). That is, new X(2)=11+4+5=20. Next, site (2,6) is part of cluster number 2. Its first moment x is 6, so X(2)=26 at the end of the second row.

In the third row, a fragment of a new cluster 4 is encountered at site (3,1). Hence, X(4)=1. At site (3,2), which has an x moment of 2, fragments of cluster numbers 1 and 4 are merged. As a consequence, new X(1)=5+1+2=8. At site (3,3), another proper label 1 is found, so the x moment of 3 for this site is added to the ongoing accumulation, and new X(1)=8+3=11.

Site (3,4) (see 110b) merges two clusters which until this site had proper labels of 1 and 2. Label 1 is chosen as the new proper label because, in the convention adopted, the lower cluster number is chosen. The new accumulation for X(1) will be the prior value of X(1) (which was 11) plus the prior value of X(2) (which was 26) plus the x moment 4 for site (3,4). That is, there is an update so that new X(1)=11+26+4=41. No further cluster fragments are encountered in the third row.

In the fourth row, the first cluster fragment is at site (4,5) and is assigned cluster label 5. As the x moment of this site is 5, now X(5)=5. At site (4,6), another fragment of cluster number 5 is found. The x moment of 6 for this site is added to the prior accumulation, so X(5)=5+6=11.

The first cluster fragment found in the fifth row is located at site (5,2). It is assigned cluster number 6, a new cluster number. Hence, X(6)=x(5,2)=2. A fragment of another cluster, number 7, is found at site (5,4). Thus, X(7)=x(5,4)=4. At site (5,5), a cluster merger occurs for cluster numbers 5 and 7. The surviving proper cluster label is cluster number 5 for the merged clusters. Hence, adding the prior X(7) and X(5) and the current x(5,5) yields new X(5)=11+4+5=20. At site (5,6), another fragment of cluster number 5 is found, and X(5) is updated to X(5)=20+6=26.

In the bottom row, the first cluster fragment found there is at site (6,2) and is part of cluster number 6. Hence, new X(6)=old X(6)+x(6,2)=2+2=4. At site (6,3), another fragment of cluster number 6 is found. Its first moment x=3 so new X(6)=4+3=7. At site (6,4), another fragment is found. It neighbors cluster numbers 6 and 7. However, cluster number 7 was merged with cluster number 5. Hence, at site (6,4), a merger occurs, with the surviving cluster number of 5. Consequently, new X(5)=old X(5)+X(6)+x(6,4)=26+7+4=37. At site (6,5), another fragment of cluster number 5 is found, so X(5) is updated to include x(6,5). Thus, new X(5)=37+5=42.

No further cluster fragments are encountered in the lattice 100B of FIG. 1(b). Therefore, proper labels 1 and 5 survive as the only proper labels. The cluster moments are X(1)=41 and X(5)=42.
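The walkthrough of this example can be checked with a short sketch of a single-scan labeling pass that accumulates the population N and the first moment X together. It is a simplified illustration and not the Appendix 1 or Appendix 4 listing. The GRID below is reconstructed from the site coordinates named in the walkthrough rather than copied from FIG. 1, a single non-background class is assumed, and all names are hypothetical.

```python
# Occupied sites ("G") reconstructed from the walkthrough; rows top to bottom.
GRID = [
    "GG..GG",
    ".G.GGG",
    "GGGG..",
    "....GG",
    ".G.GGG",
    ".GGGG.",
]


def ehk_first_moment(grid):
    """Single scan that accumulates population N and first x moment X per label."""
    N = [0]  # N[m] > 0: population; N[m] < 0: minus the label pointed to
    X = [0]  # X[m]: cumulative x moment stored at proper label m
    labels = [[0] * len(row) for row in grid]

    def proper(m):
        while N[m] < 0:
            m = -N[m]
        return m

    for r, row in enumerate(grid):
        for c, ch in enumerate(row):
            if ch != "G":
                continue
            x = c + 1  # columns are numbered from 1, as in the text
            neigh = set()
            if r > 0 and labels[r - 1][c]:
                neigh.add(proper(labels[r - 1][c]))  # neighbor above
            if c > 0 and labels[r][c - 1]:
                neigh.add(proper(labels[r][c - 1]))  # neighbor to the left
            if not neigh:
                N.append(1)  # new cluster fragment with a new label
                X.append(x)
                labels[r][c] = len(N) - 1
            else:
                s = min(neigh)  # the smallest label is kept as proper label
                for m in neigh - {s}:
                    N[s] += N[m]
                    X[s] += X[m]
                    N[m] = -s  # m now points to the proper label
                N[s] += 1
                X[s] += x
                labels[r][c] = s
    clusters = {m: (N[m], X[m]) for m in range(1, len(N)) if N[m] > 0}
    return clusters, len(N) - 1


clusters, label_count = ehk_first_moment(GRID)
print(clusters)     # {1: (12, 41), 5: (10, 42)}
print(label_count)  # 7
```

Seven labels are issued and two proper labels survive, with X(1)=41 and X(5)=42, in agreement with the hand calculation.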

EXAMPLE 2

EHK processing may be applied to FIG. 1(c) also. FIG. 1(c) uses BBHK with label recycling, whereas FIG. 1(b) uses BBHK without label recycling.

At the end of row 1, X(1)=1+2=3 and X(2)=5+6=11. At the end of row 2, X(4)=X(1)+x(2,2)=3+2=5. Also, X(5)=X(2)+x(2,4)+x(2,5)+x(2,6)=11+4+5+6=26. At the end of row 3, label 1 is the only surviving proper label, and X(1)=41. This completes the first cluster because it does not extend to row 4.

At the end of row 4, X(4)=11. At the end of row 5, X(1)=2 and X(2)=X(4)+x(5,4)+x(5,5)+x(5,6)=11+4+5+6=26. At the end of row 6, X(4)=X(1)+X(2) + the x moments of the four cluster fragments of cluster number 4 in the bottom row. Thus, X(4)=2+26+2+3+4+5=42.

When label recycling is employed, the process preferably outputs results for completed clusters for a completed row because the label will be recycled. If results for completed clusters were not output at completed rows, that information would be lost unless other steps are taken to preserve it.

EXAMPLE 3

Another example of a cluster parameter that can be calculated using EHK processing is the moment of inertia of a cluster in two dimensions. It will be understood that more dimensions can be used.

Consider a lattice of point masses at each lattice site. The mass of each site is taken to be unity. The moment of inertia I_(s).sup.(2) of a cluster s in two dimensions around an axis perpendicular to the X-Y plane and going through the center of mass of the cluster is given by Equation (8): ##EQU6## In Equation (8), the summation is taken over all sites s of the cluster. One may seek to define an equivalent cluster ellipse which would have the same mass s and the same moment of inertia I_(s).sup.(2) as the cluster in question. The moment of inertia for an ellipse around an axis perpendicular to its plane and going through the intersection of the ellipse principal axes is given by the following Equation (9): ##EQU7## where s, a, and b are the mass, the principal axis and the minor axis of the ellipse, respectively. For a square lattice, assume that the distance between nearest neighbors is 1 so that each site is associated with a unit area. Therefore, the area of the effective ellipse is also s. Hence, the area of the equivalent ellipse is given by Equation (10):

    s=abπ                                                   (10)

In Equations (9) and (10), a and b are unknown quantities. The values for s and I_(s).sup.(2) ##EQU8## can be calculated using the EHK algorithm for the summations in Equation (8). Solving for a and b yields Equations (11) and (12).

The eccentricity e of the cluster is given by Equation (13). ##EQU9##

The quantities of interest, a, b and e, can be calculated by programming Equations (11), (12) and (13) into the computer. They can be calculated as part of the EHK processing when the EHK completes processing a cluster. They provide the image processing system with information on the shape of the cluster. The values of e lie between 0 and 1. If e=0, then the cluster is likely to be more circular. As e increases the cluster is likely to be more elongated. The a and b values provide information on the linear dimensions of the cluster.
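A minimal sketch of such a calculation follows. Because Equations (8), (9) and (11) through (13) are not reproduced in this text, the sketch assumes the standard results: the moment of inertia of a uniform ellipse with semi-axes a and b about the perpendicular axis through its center is s(a²+b²)/4, the area is πab as in Equation (10), and the eccentricity is sqrt(1-b²/a²). The function name and the fallback to a circle for very small clusters are assumptions of the sketch.

```python
import math


def equivalent_ellipse(s, sum_x, sum_y, sum_x2, sum_y2):
    """Return (a, b, e) of an ellipse with the same mass and moment of inertia.

    The inputs are cumulative sums over the cluster sites, which are the kind
    of additive quantities the EHK feature can accumulate during the scan.
    """
    # Moment of inertia about the center of mass for unit point masses.
    inertia = (sum_x2 - sum_x ** 2 / s) + (sum_y2 - sum_y ** 2 / s)
    p = 2.0 * inertia / s   # equals (a^2 + b^2) / 2 under the assumed Eq. (9)
    q = s / math.pi         # equals a * b under Eq. (10)
    if p <= q:
        # Point masses on a lattice can give a smaller inertia than any
        # ellipse of area s; fall back to a circle of area s (assumption).
        a2 = b2 = q
    else:
        root = math.sqrt(p * p - q * q)
        a2, b2 = p + root, p - root
    a, b = math.sqrt(a2), math.sqrt(b2)
    e = math.sqrt(1.0 - b2 / a2)
    return a, b, e


# Hypothetical cluster: five sites in a horizontal line, x = 1..5 and y = 1.
a, b, e = equivalent_ellipse(s=5, sum_x=15, sum_y=5, sum_x2=55, sum_y2=5)
print(a, b, e)
```

For the elongated line of five sites the eccentricity comes out close to 1, while a single isolated site yields e=0 through the circular fallback.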

The Unbounded HK Feature

For the UHK feature, when dealing with a d-dimensional lattice problem, the lattice has finite bounds in (d-1) dimensions. In the remaining dimension, under the UHK feature, lattice size is not bounded. For example in 2-D, the extension provided by the UHK feature enables the scanning of a continuous (unbounded) tape of a given (bounded) width for surface clusters. With the UHK, there would be no limits on the length dimension of the tape; only its width is required to be bounded.

The same principle applies to 3-D objects. A 3-D object, such as an unbounded cylindrical pipe, could be continuously scanned along its height dimension under the UHK feature. As the cylinder is scanned, cluster numbers and other cluster parameters can be evaluated and analyzed in real time. In the 3-D version of the unbounded algorithm, there is no restriction on the shape of the 2-D cuts scanned. The only restriction is that they would be bounded in 2-D. The UHK feature is independent of the specific version of the HK algorithm used. It includes also multi-processor versions of the HK algorithm.

One example of a 3-D application that could benefit from cluster analysis is x-ray tomography of cracks in rocket fuel. The cracks could be looked at as clusters. Again, the cluster analysis could be done in real time, continuously, using the EUHK algorithm. Another 3-D application would be neutron scattering for explosives detection where the voxel classes would be defined by chemical elements and the scattered γ-ray intensity.

EXAMPLE 4

FIG. 3 illustrates a system that applies the EUHK algorithm for analyzing in real time the surface cluster structure of a continuous tape 240 scanned by a surface scanning device 242 such as a microscope or interferometer. Scanning device 242 scans tape 240 along the direction of the width of the tape. The tape 240 is moved from right to left by a tape drive 244 under control of a tape motion controller 246. The scanned data is digitized and provided in digital form to a computer 248. For example, the digitized data may represent surface height relative to some reference level. Also, further digital data may comprise a position measure that would enable the cluster processing computer 248 to determine the x, y position of a pixel on the tape relative to the beginning of the tape. Using the height information, computer 248 would assign a class value to this position. For example, the surface roughness height range could be subdivided into sub-ranges where each sub-range would be defined as a different class and assigned a unique numerical value. Other examples of surface entities could be surface imprints. These imprints could be made from different materials which could be determined by x-ray scattering.

Once computer 248 assigns class values to a line of pixels, that scanned line information is read by the "merge-cluster2" program shown in Appendix 2. These scanned lines are read one by one by the computer and analyzed for clusters. The computer 248 controls the scanning rate of scanning device 242 and the tape motion. If the "merge-cluster2" program detects abnormal surface clusters, it alerts the operator by displaying the cluster position and other parameters on a computer console 250. The computer could also take an automatic action, such as shutting down the scanner and the tape motion controller 246. The program could also be stopped from a keyboard 252 through the invocation of a stop-program procedure.

For the arrangement of FIG. 3 to work, computer 248 must process the lines faster than the rate at which scanning device 242 scans the tape 240. If it does not, the solutions to this problem are:

1. reduce the scanning rate of the scanning device 242; or

2. increase the speed of computer 248; or

3. use a multi-processor computer system.

The HK algorithm can be parallelized by partitioning the lattice into sections of lines, where each processor can work on its sections independently of the others. These sections are stitched together through the HK multi-labeling process.

In the Appendix 2 example of the EUHK, clusters are analyzed after they are completed. However, the processor does not have to wait until a cluster is completed before it is analyzed. Since clusters are analyzed at the end of each row, it is also possible to analyze cluster fragments with this method. The program can be easily modified for looking at cluster fragments after the completion of every row. The same cluster analysis could be done for the cluster fragments as the one done for the complete cluster. If the cluster fragments show abnormality, the alert procedure could be called.
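The row-by-row processing of an unbounded input can be sketched as a generator that keeps only the labels of the previous row and reports each cluster once no site of the current row belongs to it. This is a simplified illustration and not the "merge-cluster2" listing of Appendix 2: it discards finished labels instead of reusing label numbers, accumulates only the population and the first x moment, and every name in it is hypothetical.

```python
def stream_clusters(rows, background=0):
    """Yield (class, population, x_moment) for each cluster as it completes.

    rows may be any iterable of equal-width rows of class values, including
    an endless generator, since only the previous row's labels are retained.
    """
    N, X, C = {}, {}, {}  # population or pointer, x moment, class, per label
    next_label = 1
    prev = []             # proper labels of the previous row (0 = none)

    def proper(m):
        while N[m] < 0:
            m = -N[m]
        return m

    def flush(live):
        done = sorted(m for m in N if N[m] > 0 and m not in live)
        out = [(C[m], N[m], X[m]) for m in done]
        for m in [m for m in N if N[m] < 0] + done:
            del N[m], X[m], C[m]  # pointer labels and finished labels
        return out

    for row in rows:
        cur = [0] * len(row)
        for c, t in enumerate(row):
            if t == background:
                continue
            neigh = set()
            if c < len(prev) and prev[c] and C[proper(prev[c])] == t:
                neigh.add(proper(prev[c]))      # same-class neighbor above
            if c > 0 and cur[c - 1] and C[proper(cur[c - 1])] == t:
                neigh.add(proper(cur[c - 1]))   # same-class neighbor to the left
            if not neigh:
                m = next_label
                next_label += 1
                N[m], X[m], C[m] = 1, c + 1, t
            else:
                m = min(neigh)
                for other in neigh - {m}:
                    N[m] += N[other]
                    X[m] += X[other]
                    N[other] = -m
                N[m] += 1
                X[m] += c + 1
            cur[c] = m
        cur = [proper(m) if m else 0 for m in cur]
        yield from flush(set(cur))  # clusters that did not reach this row
        prev = cur
    yield from flush(set())         # input ended: report what remains


rows = [
    [1, 1, 0, 0, 1, 1],
    [0, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 1, 1],
    [0, 1, 1, 1, 1, 0],
]
completed = list(stream_clusters(rows))
print(completed)  # [(1, 12, 41), (1, 10, 42)]
```

The first cluster is reported as soon as the fourth row has been scanned, which corresponds to outputting results for completed clusters at the end of a row. Reporting fragments instead would amount to inspecting the live labels at the same point.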

FIG. 3 depicts an application where microscopic features of a surface are analyzed for clusters. The same approach could be used for air or space surveillance. For example, a satellite could be continuously scanning stripes of land. This is essentially the same approach as applied to the scanning device 242 of FIG. 3. In this case, the scanning device (the satellite) is the moving object and the scanned object (the Earth) is considered to be the stationary object. Also, in the FIG. 3 scanner example, pixel sizes are measured in microns or nanometers. For space surveillance, the "pixels" would be measured in meters. Yet, the same cluster analysis principles would apply to both.

Appendix 2 demonstrates the application of the EUHK to a 2-D problem. However, the same principles would apply to 3-D problems. Instead of scanning and analyzing the subject matter line by line, in 3-D, one analyzes the subject (object) plane by plane or in groups of planes. Also, one would be clustering voxels instead of pixels. In a simple cubic lattice, there would be six nearest neighbors instead of four for the square lattice. So the program would have to check for two nearest neighbors in the same plane and for a nearest neighbor in the previous plane.

The analysis described here does not apply only to square or cubic lattices. It could be used for any lattice structure in all dimensions and not just for the nearest neighbors.

Non-Lattice Extension for the EUHK Algorithm

The previous section discusses the use of the EUHK algorithm for two and three dimensional lattices. Such processing could be further extended to more general graphs such as given in FIG. 4. One can apply the EUHK for the graph represented in this figure. This graph consists of linearly connected supernodes 1, . . . , i-1, i, i+1, . . . . Each supernode consists of regular nodes labeled as i, k where i is the supernode index and k is the k'th regular node in the i'th supernode. In this graph, edges exist between regular nodes only if they belong to the same supernode or to two adjacent supernodes. A supernode edge, shown as a heavy line, exists between two supernodes if there exists at least one edge between two regular nodes belonging to the two supernodes.

By partitioning a graph into a linear form of supernodes as represented by FIG. 4, one can apply the EUHK algorithm. As in the lattice case, one needs only to store adjacent sets of supernodes in computer memory. Also, one could reuse labels, just as explained above for the 2-D and 3-D lattices. For example, one set of labels would apply for the odd supernodes and another set for the even supernodes. The number of label sets depends only on the number of supernodes that are kept in memory. One may advantageously assign one set of labels per supernode or to a subset of sequential supernodes.

Cluster Analysis Computer

It will be understood from the prior description that FIG. 3 describes an image processing system. This invention deals with an improvement in the computer component (cluster processing computer) 248. FIG. 5 presents an example of a block diagram of a computer 260 that can be used for cluster analysis. Computer 260 includes a system bus 262 that connects memory banks 264a, b, c, d, CPUs (central processing units) 266a, b, c and I/O devices via an I/O controller 268. The CPUs 266 execute programs that are stored in memory. Each CPU consists of an ALU (arithmetic logic unit) that executes machine instructions, a PCU (program control unit) that controls program execution, and a collection of registers, which are used in data computation. One of the registers is a program counter that points to the machine code instruction in memory that is to be executed next. Some CPUs contain cache memory to speed computation. Memory stores both data and machine code. The memory may comprise DRAM, SRAM, ROMs, or other random access non-volatile storage. Each memory bank 264 consists of a vector of memory words. Because of the nature of a memory bank, accessing any word takes a fixed amount of time. The CPUs on the bus can access any memory bank through a CPU register that points to the specific memory. The computer 260 communicates with the outside world through various I/O device controllers 268, examples of which are console, disk, serial lines, sensory devices, etc.

FIG. 6 shows a memory bank 264a. The memory elements may be randomly accessible devices such as DRAMs or SRAMs, or may comprise other forms of memory, whether volatile or non-volatile, and may be solid-state or magnetic. The nature of the memory device is chosen by the circuit designer, but ordinarily would be DRAM. The organization (as opposed to the accessibility) of the memory is sequential and linear for each bank and can be represented by a vector. Accessing the k'th word is usually done by a CPU register 270 that contains the value k as the memory index.

The cluster processing computer 260 receives image data through an I/O controller 268. In FIG. 3, the input comes from a real time scanner. However, the input can come also from a mass storage medium such as a disk or a magnetic tape. Alternatively and/or additionally, the computer can also generate the input data through simulation. The input data would usually come in one of two forms, in a raster form or a vector form.

In a raster form, the data would be represented by an array. Each arrayelement would correspond to a lattice site. In two dimensions, thiswould correspond to an array of pixels, and in three dimensions thatwould be an array of voxels. Data represented as a 2-dimensional or3-dimensional array is stored in computer memory as a one dimensionalarray because computer memory 264a shown in FIG. 6 is organized as a onedimensional array.

Further data structures represented by FIGS. 10 through 14 are suitable.In general, for storing a 2-dimensional array, assuming that the imageconsists of R×C pixels, the first R row elements would be stored first,followed by a second R row elements, and so on. In some computerlanguages, for two-dimensional arrays, the columns are stored one afterthe other. (In the preferred embodiment, only few rows have to be storedat any point in time).

The data that is stored for a pixel (voxel) is its information content. This data could be either a scalar or a vector. The pixel information content could be gray level, color, electromagnetic frequency ranges and intensities, pressure and temperature vectors, and so on. The values of the vector components could be discrete or continuous.

To illustrate this point, consider color. Color can be defined by a vector of three components--Red, Green and Blue (RGB), for example. These components each assume values from 0 to 1. FIG. 7 illustrates this point and shows a two component color R and G represented in a Cartesian system. Each point in the universe of colors defined by an imaginary square (0,0), (1,0), (1,1) and (0,1) represents an RG combination. For cluster analysis, one must classify this color space. For example, RG points that fall inside one of the two rectangular shapes 272, 274 belong to class 1. Those that are inside the irregular shape labeled by 2 belong to class 2. On the other hand, those data points that are in neither 1 nor 2 belong to the background class 0. Only pixels belonging to class 1 or 2 are analyzed for clusters.
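
The classification step can be sketched as follows, purely as an illustration. The coordinates of the two rectangles and the disk used as a stand-in for the irregular class 2 region are assumptions made for this example; the actual shapes of FIG. 7 are not specified in the text, and the name `classify_rg` is hypothetical.

```python
# Hypothetical class regions in the unit RG square (not taken from FIG. 7).
CLASS1_RECTS = [(0.1, 0.1, 0.3, 0.4),   # (r_min, g_min, r_max, g_max)
                (0.6, 0.1, 0.9, 0.3)]
CLASS2_DISK = (0.5, 0.7, 0.2)           # centre r, centre g, radius (stand-in shape)


def classify_rg(r, g):
    """Return the class identifier (0 = background) for an RG value."""
    for r0, g0, r1, g1 in CLASS1_RECTS:
        if r0 <= r <= r1 and g0 <= g <= g1:
            return 1
    cr, cg, radius = CLASS2_DISK
    if (r - cr) ** 2 + (g - cg) ** 2 <= radius ** 2:
        return 2
    return 0                             # neither region: background class


# Classify a small image whose pixels hold (R, G) pairs.
image = [[(0.2, 0.2), (0.5, 0.7)],
         [(0.95, 0.95), (0.7, 0.2)]]
classes = [[classify_rg(r, g) for (r, g) in row] for row in image]
```

Only the resulting class identifiers, not the original RG values, are needed by the cluster analysis.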

The pixel values do not have to denote colors. For example, they could denote temperature and pressure (TP). However, the same principle of classifying would apply. Instead of classifying color values, the computer or processor would classify TP values. Just as the colors can be represented by a three dimensional space, the TP classifications form a two dimensional space. These spaces need not be two or three dimensional spaces, but can be of any dimension. Such spaces are known as feature spaces. The paper by S. Haimov et al. in IEEE Trans. on Geoscience and Remote Sensing, Vol. 27, pp 606-610 (1989) describes a four dimensional space. Indeed, that paper describes the use of the BBHK algorithm for cluster analysis in a four dimensional feature space. Accordingly, the EHK feature can also be applied to high dimensional feature spaces.

The value of pixels/voxels could be discrete. For example, in direct neutron imaging, fast neutrons are scattered by nuclei. In addition, the nuclei emit gamma radiation that is specific to the type of nuclide. Such emissions identify the chemical elements within a voxel. Thus the class of a voxel is determined by the chemical element therein. (To simplify, assume that each voxel contains only one element type.) The coordinates of the voxel are determined by detecting the gamma ray radiation from different angles. So now the processor has the position of the voxel and its information content (element).

To sum up, for raster presentation, a two or three dimensional array is created in memory. Physically, the array is one dimensional because of the internal structure of computer memory. The values contained in the array would be class identifiers that represent the class type of the pixel or voxel, and any suitable technique for classification of pixels or voxels will work with the present invention--which is concerned with analyzing clusters whose pixels/voxels are (or have been) characterized by specific class identifiers. Defining pixel/voxel classes depends on the specific application that would use EUHK processing.

It may be noted that position data for pixels or voxels is not required to be explicitly stored in memory. Since the dimensions of the pixel/voxel arrays are known, the position of the pixel can be inferred from the pixel/voxel array indices.

Cluster analysis is concerned with neighboring pixels/voxels having the same (or similar) classification. So if a pixel class is stored in memory location P and the pixel is not on any of the four boundaries of the image, and assuming the memory is linear rather than two or more dimensional, then the memory locations of its nearest neighbors would be P-1, P+1, P-R and P+R, where R is the number of pixels in a row. Similar observations could be made for the six nearest neighbors in three dimensions. The specific implementation of this invention is not confined to using linear style memory. Two dimensional memory arrays that store image data can be used in a graphical representation. Such memory arrays may be configured by a series of interconnected shift registers. These have been used in the field of character recognition and are usable in the present invention also.
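
A minimal sketch of this neighbor computation is given below. It assumes 0-based linear addresses and adds explicit boundary checks, which the interior-pixel case in the text does not need; the function name `nearest_neighbors` and the parameter `n_rows` are illustrative only.

```python
def nearest_neighbors(P, R, n_rows):
    """Linear addresses of the nearest neighbors of the pixel stored at P.

    R is the number of pixels in a row; n_rows is the number of rows held
    in memory. Neighbors falling outside the image are omitted.
    """
    row, col = divmod(P, R)
    neighbors = []
    if col > 0:
        neighbors.append(P - 1)      # left
    if col < R - 1:
        neighbors.append(P + 1)      # right
    if row > 0:
        neighbors.append(P - R)      # above
    if row < n_rows - 1:
        neighbors.append(P + R)      # below
    return neighbors
```

For an interior pixel the result is exactly P-1, P+1, P-R and P+R; at an image boundary the missing neighbors are simply dropped.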

In addition to a raster representation, the processor(s) using this invention will work with a vector representation of images. In vector representation, not only is the class of a data point stored in computer memory, but also the (x,y) or (x,y,z) coordinates of the point in two or three dimensions, respectively. Spherical or cylindrical coordinates may also be used, or any other (preferably orthogonal) coordinate system. So in two dimensions, three quantities are needed--the two coordinates and the class identifier. In three dimensions, four quantities are needed--the three coordinates and the class identifier. When using vector representation, for example, two sites are considered to be neighbors if the distance between them is less than some predefined distance r. Other definitions of neighbors could also be used to define clusters.

FIG. 8 demonstrates vector representation images. They do not have a lattice structure inherent from the scanning method. However, a grid 280 is overlaid on the image to reduce the computational time for determining neighbors. The grid contains arbitrary cells. Sites represented by the asterisk symbol (*) 282 are scattered throughout the image and the cells of the grid. Hence, the vector data can be grouped into cells. For each cell, the processor can allocate a memory word that points to memory segments that contain the coordinates and the class identifier for each (x,y) site. FIG. 8 shows how clusters can be identified by drawing circles 284 of radius r around some of the sites 282. If a site 282 falls within a circle 284 centered on another site 282, then the two sites are deemed to be in the same cluster. For each site in a cell, the computer need only look in the same cell and the eight neighboring cells for cluster members. The cell mechanism allows the processor to limit the scope of the search for cluster members. It also enables the processor to use the EUHK algorithm because the system will store two rows of cells at the same time. In three dimensions, the system would need to store two planes of cells. FIG. 8 is a special case of FIG. 4. In FIG. 4, each supernode corresponds to a row of cells in 2 dimensions (or a plane in 3 dimensions) in FIG. 8.
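
The cell-limited neighbor search can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of the preferred embodiment: square cells are assumed, the cell size is assumed to be at least r (so that the same cell plus its eight neighbors always suffice), and the names `build_cells` and `neighbors_within` are hypothetical.

```python
import math
from collections import defaultdict


def build_cells(sites, cell_size):
    """Group site indices by the grid cell that contains each (x, y) site."""
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(sites):
        cells[(int(x // cell_size), int(y // cell_size))].append(idx)
    return cells


def neighbors_within(sites, cells, idx, r, cell_size):
    """Indices of sites closer than r to site idx.

    Only the site's own cell and its eight adjacent cells are searched,
    which is sufficient when cell_size >= r (assumed here).
    """
    x, y = sites[idx]
    cx, cy = int(x // cell_size), int(y // cell_size)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in cells.get((cx + dx, cy + dy), ()):
                xj, yj = sites[j]
                if j != idx and math.hypot(x - xj, y - yj) < r:
                    found.append(j)
    return sorted(found)


sites = [(0.5, 0.5), (1.2, 0.6), (3.5, 3.5), (0.9, 1.4)]   # made-up coordinates
cells = build_cells(sites, cell_size=1.0)
close_to_first = neighbors_within(sites, cells, 0, r=1.0, cell_size=1.0)
```

Sites found this way would then be given the same cluster label, exactly as nearest-neighbor sites are in the lattice case.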

Turning now to the actual processing of the image, a preferred embodiment uses machine instructions stored in memory to analyze the (classified) image data stored in memory. These machine instructions could be stored in dynamic memory but preferably are stored in a non-volatile memory such as some form of ROM. In the early days of computers, programmers would write machine instruction code to manipulate data. Unfortunately, complicated machine code is very difficult to understand. Instead, it is more convenient to represent these machine instructions with pseudo-computer code, as shown in Appendices 1 to 4. This pseudo-code can be easily translated to computer language code such as C++ by a person ordinarily skilled in computer programming. The C++ code can be compiled into machine code that can be stored in the computer memory shown in FIGS. 5 and 6. Such machine code will cause the processor to process the image data and generate an output that contains the cluster analysis of the input data. Alternatively, a language interpreter that does not create machine code could be used instead of a compiler. However, interpreted code is usually significantly slower than compiled code. The pseudo-code and the present description use generalized HK operators ⊕, and scalar and non-scalar quantities. Object oriented computer languages, such as C++, allow defining general quantities and operators. Thus, the use of such quantities is in line with modern computer concepts.

Real Time Analysis of Unbounded Lattices and Parallel Processing

In real time analysis of clusters in two dimensions, one of the lattice dimensions (e.g. the width) is bounded, whereas the second dimension (e.g. the length) is unbounded. In this application, as shown in FIG. 9(b), assume that the lattice is scanned widthwise, line by line. The scanned information is stored in a FIFO (First In First Out) scan window memory. Only a small number of lines will be stored in scan window memory. As lines are stored in memory, they are analyzed in real time by the processor(s). This configuration uses a label recycling procedure similar to that used in the original HK algorithm. For a single processor operation, one would have to store two rows and two sets of labels. Cluster parameters would be calculated every time a row is completed.
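
The two-row scan window can be sketched with a bounded FIFO, as shown below. This is only an illustration of the buffering idea; the generator name `scan_pairs` is hypothetical, and label recycling and the cluster calculations themselves are omitted.

```python
from collections import deque


def scan_pairs(rows, window=2):
    """Yield (previous_row, current_row) while holding at most `window` rows.

    `rows` may be an unbounded iterator, e.g. lines arriving from a scanner.
    The oldest row is discarded automatically when a new row is appended.
    """
    fifo = deque(maxlen=window)
    for row in rows:
        fifo.append(row)
        previous = fifo[0] if len(fifo) == window else None
        yield previous, fifo[-1]


for previous, current in scan_pairs([[1, 0], [1, 1], [0, 1]]):
    # A real implementation would label `current` against `previous` here.
    pass
```

Since only the previous and current rows are ever held, memory use is independent of the length of the scanned lattice.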

Parallel real time cluster analysis is more challenging than parallel fixed size lattice analysis. If the size of the lattice is known, it can be partitioned into sections. FIG. 10(a) partitions a fixed size image 290 into five sections, 291, 292, 293, 294, 295. Each section is assigned to a respective processor and memory. Each processor would analyze its corresponding lattice section. Only at section boundaries such as 296 would special action be needed for clusters extending from one section to the next. At that point, the HK labeling technique is preferably used to stitch the sections together and complete the cluster analysis.

For the type of arrangement shown in FIG. 9(b), a fixed lattice is partitioned into S sections. For purposes of illustration, let the lattice described here contain two classes where one class is a background class that is not analyzed. This calls for kR(3S-1) bytes for storing the lattice, k⌈R/2⌉(3S-1) bytes for the labels, and r⌈R/2⌉(3S-1) bytes per F^(n) quantity for a single class analysis. The expression ⌈x⌉ denotes the number x rounded up to the nearest integer value. R, k and r are the row size, the number of bytes per label, and a unit of F^(n) quantity, respectively. The factor of 3 that appears in these formulas corresponds to the need to store for each section--except for the first section--its first row, associated labels and F^(n) quantities to enable linking each section with its previous section.
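
As a worked example of these storage estimates, the three expressions kR(3S-1), k·ceil(R/2)·(3S-1) and r·ceil(R/2)·(3S-1) can be evaluated as follows. The function name `section_memory_bytes` and the sample values of R, S, k and r are assumptions chosen only for illustration.

```python
def section_memory_bytes(R, S, k, r):
    """Evaluate the storage estimates for a lattice split into S sections.

    R: row size, k: bytes per label, r: bytes per unit of an F^(n) quantity.
    Returns (lattice_bytes, label_bytes, bytes_per_F_quantity).
    """
    factor = 3 * S - 1
    half_row = (R + 1) // 2          # integer form of ceil(R / 2)
    lattice_bytes = k * R * factor
    label_bytes = k * half_row * factor
    bytes_per_F = r * half_row * factor
    return lattice_bytes, label_bytes, bytes_per_F


# Hypothetical configuration: 101-pixel rows, 4 sections, 4-byte labels,
# 8-byte F^(n) units.
lattice_bytes, label_bytes, bytes_per_F = section_memory_bytes(101, 4, 4, 8)
```

With these sample values the factor (3S-1) is 11 and ceil(R/2) is 51, giving 4444, 2244 and 4488 bytes respectively.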

Thus, for fixed size lattices, the entire lattice can be accessed and subdivided. On the other hand, for real time operation, it is preferred to access only a small window 297 of the scanned area, as shown in FIG. 9(b). Only a small amount of memory, representing the scan window, would be needed for analysis. As the scan window memory used for the analysis becomes larger, the real time information becomes less timely. One question is how to partition the space for parallel operation. In one embodiment, each processor would handle a single row. A better approach is to partition the individual rows in the scan window into S consecutive processor sections. Again, the multi-labeling technique will allow linking together these sections, once all the row sections are completed.
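
One simple way to divide a row of R sites into S consecutive processor sections is sketched below. The even split and the function name `partition_row` are assumptions made for illustration; the text does not prescribe how section widths are chosen.

```python
def partition_row(R, S):
    """Split column indices 0..R-1 into S consecutive half-open ranges.

    Section widths differ by at most one site; the first (R mod S)
    sections receive the extra site.
    """
    base, extra = divmod(R, S)
    bounds = []
    start = 0
    for s in range(S):
        end = start + base + (1 if s < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds
```

Each processor would label its own range of the row, and clusters that cross a boundary between adjacent ranges would be linked afterwards.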

Data structures

In using the present invention, it is preferred that raster or lattice type images will usually be stored in arrays. For example, FIG. 10 presents an array 300 that displays the classes of sites for a 5×6 lattice. Five classes are identified for this lattice: classes 0, 1, 2, 3, 4. These class numbers are shown in FIG. 10. Class 0 is considered to be background and it is not analyzed for clusters. The other four classes are analyzed for clusters.

In FIG. 10, for class 1 for example, there are three clusters if only the nearest neighbors are considered for the cluster definition. The three class 1 clusters are as follows: a first cluster 301 at site (4,1); a second cluster 302 that includes sites (4,3), (4,4), and (5,3); and a third cluster 303 that includes just one site (6,2). Likewise, there are three class 2 clusters: a first one at sites (2,5) and (2,6); a second at (3,1), (3,2), and (4,2); and a third one at (4,5). FIG. 10 contains five class 3 clusters as follows: a first one 304 at (1,1), (1,2), and (2,1); a second at (1,4); a third one at (2,3); a fourth one at (5,1) and (6,1); and a fifth one at (5,4) and (5,5). For class 4 there is only one cluster, at (1,5). Thus, FIG. 10 is like FIG. 1(a). While FIG. 10 has five classes of data, FIG. 1(a) has just two types (classes) of data--either background or gray. That is, FIG. 1(a) presents a 6×6 lattice represented by a 6×6 array. It consists of two classes, gray and white, where the white class is considered to be background.

Another lattice array is the site label array that stores labels for the sites. FIGS. 1(b) and 1(c) are such arrays.

FIGS. 11(a), (b) and (c) concern a vector presentation of an image and, in this example, the 5×7 image of FIG. 8. The image is represented by two 5×7 arrays marked as FIG. 11(a) and FIG. 11(c). FIG. 11(c) gives the count of asterisk (*) symbols in each cell shown in FIG. 8.

The coordinates and the classes of each site are given in FIG. 11(b) by a two-dimensional array 310. There are three columns in this array: the first column denotes the x-coordinate of the site; the second column denotes the y-coordinate of the site; and the third column denotes the class type of the site. In the example of FIG. 8, there is only one class, which is the asterisk. Therefore, if a site contains a cluster fragment, its third column element (marked "class" in FIG. 11(b)) will be set to "1". Since there are 44 sites in the FIG. 8 image, there will be 44 rows in array 310.

FIG. 11(a) is a two-dimensional pointer array 312. Each element of array 312 that corresponds to a non-empty cell points to the row in array 310 that contains the first site of the cell. The other sites of the cell follow the first cell site contiguously. For example, element (1,1) of array 312 points to the first row of array 310. The other two sites in that top left cell (1,1) are represented by rows 2 and 3 of array 310. Element (1,2) in array 312, which represents one site, points to row 4 of array 310.
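
The relationship between the site table, the pointer array and the count array can be sketched as follows. The sketch uses 0-based row indices, square cells and made-up site coordinates, none of which are taken from FIG. 8 or FIG. 11; the name `build_cell_index` is hypothetical.

```python
def build_cell_index(sites, n_rows, n_cols, cell_size):
    """Build a site table plus per-cell pointer and count arrays.

    sites: list of (x, y, cls) tuples in arbitrary order.
    Returns (table, first, count), where table holds the sites sorted so
    that the sites of one cell are contiguous, first[r][c] is the table row
    of the first site in cell (r, c) or None for an empty cell, and
    count[r][c] is the number of sites in that cell.
    """
    def cell_of(site):
        x, y, _ = site
        return int(y // cell_size), int(x // cell_size)

    table = sorted(sites, key=cell_of)
    first = [[None] * n_cols for _ in range(n_rows)]
    count = [[0] * n_cols for _ in range(n_rows)]
    for row_idx, site in enumerate(table):
        r, c = cell_of(site)
        if first[r][c] is None:
            first[r][c] = row_idx
        count[r][c] += 1
    return table, first, count


sites = [(1.5, 0.2, 1), (0.1, 0.1, 1), (0.4, 0.8, 1), (0.2, 1.5, 1)]
table, first, count = build_cell_index(sites, n_rows=2, n_cols=2, cell_size=1.0)
```

The sites of a cell (r, c) are then the table rows first[r][c] through first[r][c] + count[r][c] - 1.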

The ability to partition the image into cells allows one to apply the UHK to vector represented images. Similar arrays can be defined in three dimensions.

FIG. 12A displays an example of a 4×6 2-dimensional array 320 for N, the labeling array. FIG. 12B displays a corresponding F^(n) quantities array 322. Array 320 allows for six labels and four classes. Each array element is either positive or negative. For example, data for class 1 is in a leftmost column 324 of FIG. 12A. In column 324, only two elements (326 and 328) are positive, corresponding to rows 1 and 5. Hence, there are two clusters with proper labels 1 (i.e., the column 1, first row entry is positive) and 5 (the first column, fifth row entry is positive). Those two clusters with proper labels have populations of 5 and 3, respectively. The other entries in the column are negative and therefore are not proper labels. They point directly or indirectly to the proper label.

The next column of array 320 is column 330. Because it is the second column in this data structure, it concerns class 2 data sites. As shown in array 320, class 2 has proper labels 332, 334 at row 1 and row 2. As can be seen, the entries for the third and fourth rows in the second column are negative, and therefore do not indicate proper labels. These two entries point directly or indirectly to a proper label. For example, in column 324, the data 336 at (2,1) points to (1,1). The data 338 at (3,1) also points to (1,1). The data 340 at (4,1) points to (3,1), which points to (1,1). Class 2 thus has two proper labels, 1 and 2, for row 1 and row 2. As can be observed from column 330, not all labels are used. For class 3 (the third column) there is only one proper label, and for class 4 (fourth column) there is only one proper label.

The F^(n) array 322 shown in FIG. 12B has valid values only where marked by the symbol V. In array 322, only array elements that have corresponding proper label elements in array 320 of FIG. 12A have valid F^(n) values. Those that correspond to non-proper labels have only partial F^(n) values and should be ignored for most applications.
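
The way negative entries lead to a proper label can be sketched for the class 1 column as follows. The first five entries follow the description of column 324 above; the sixth entry is not described in the text, so the value -5 used here is an assumption, as is the function name `proper_label`.

```python
# Class 1 column of the labeling array, indexed by label 1..6 (index 0 unused).
# Positive entry: population count of a proper label.
# Negative entry -m: this label points to label m.
N_class1 = [None, 5, -1, -1, -3, 3, -5]   # sixth entry (-5) is an assumed value


def proper_label(column, label):
    """Follow negative entries until a positive (proper) entry is reached."""
    while column[label] < 0:
        label = -column[label]
    return label


root = proper_label(N_class1, 4)          # 4 -> 3 -> 1
population = N_class1[root]               # population of the cluster containing label 4
```

Only the entry at the proper label carries a meaningful population, which is why the corresponding F^(n) values are valid only at proper labels as well.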

FIG. 13 represents an alternate and preferred data structure for the N and F^(n) arrays. This data structure is more economical in terms of memory use than those given in FIGS. 12A and 12B. FIG. 13 displays two data types, an array 350 of pointers P and cluster information structures R. In this example, the array 350 of pointers is stored in memory addresses 2000 to 2012 shown on the left of the P array 350. In this embodiment, the memory addresses 2000-2012 are used as the label set for the lattice instead of the integers 1, 2, 3, . . . , that are generated by box 220 of FIG. 2. Each array element of 350 is a pointer, which is a memory reference to another location in memory. For example, as represented by an arrow 352, location 2000, which is the first element in the array, P(1), points to location 2002 in memory, which is the third element of the P array, P(3). Array element 2002 points to memory location 6000, which holds a data structure R at that location. There may be numerous R structures. In FIG. 13, however, only two such structures 354 and 356 are shown, corresponding to memory addresses 6000 and 6400. Each structure R holds, in contiguous memory locations, the class t of the cluster, the cluster population number N (N is now always a positive integer denoting the cluster population) and the several F^(n) quantities. To retrieve the class type t of the cluster with proper label 2002, the construct P(3)→t is used. The right arrow symbol → denotes that the array element P(3) points to a t component of some R structure. Similarly, the population count for the same cluster would be given by P(3)→N, where N denotes the population count. This approach could be applied for languages like "C" that support pointer arithmetic.

FIG. 13 represents data structures for just two clusters. The proper label for the first cluster is 2002. Labels (memory addresses) 2000 and 2009 point directly to the proper label 2002, whereas label 2005 points indirectly through label 2009 to label 2002. So the cluster labels for this cluster are {2002, 2000, 2005, 2009} where the first label in this set, 2002, is the proper label. The other cluster is represented by the label set {2007, 2008, 2010, 2012} where 2007 is the proper label. Proper label 2007 points to memory location 6400, which holds the cluster information structure R (356) that holds the cluster information for the cluster with proper label 2007.

If the two clusters with the proper labels 2002 and 2007, corresponding to P(3) and P(8), respectively, become merged into proper label 2007, then the combined population count for 2007 would be (P(8)→N) ← (P(8)→N) + (P(3)→N) + 1. This operation is similar to the array operation for the population count merge in 228a of FIG. 2. The same technique would apply to EHK cluster parameters F^(n) such as cluster moments. If F^(1) denotes the first moment for the x coordinate, then the first x-moment for the merged cluster would be (P(8)→F^(1)) ← (P(8)→F^(1)) + (P(3)→F^(1)) + x(i), where i denotes the site where the cluster merger occurred. Appendix 3 presents a variation on the program of Appendix 1. It shows how a program can be written using the data structures of FIG. 13.
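
A minimal sketch of the pointer-based structure and of the merge just described is given below, with a Python dictionary standing in for the pointer array P and a small class standing in for the structure R. The population counts, moment values, and the direct links from 2008, 2010 and 2012 to 2007 are assumptions made for this example, as are the names `ClusterInfo`, `proper` and `merge_at_site`.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterInfo:                 # plays the role of a structure R
    t: int                         # class of the cluster
    N: int                         # population count (always positive)
    F: list = field(default_factory=list)   # F^(n) quantities; F[0] is F^(1)


# Label -> either another label (a pointer) or a ClusterInfo (proper label).
P = {
    2000: 2002, 2009: 2002, 2005: 2009,
    2002: ClusterInfo(t=1, N=4, F=[10.0]),      # assumed values
    2008: 2007, 2010: 2007, 2012: 2007,         # assumed direct links
    2007: ClusterInfo(t=1, N=6, F=[21.0]),      # assumed values
}


def proper(P, label):
    """Follow pointers until a label that holds a ClusterInfo is reached."""
    while not isinstance(P[label], ClusterInfo):
        label = P[label]
    return label


def merge_at_site(P, keep, other, x_i):
    """Merge two clusters joined by a new site whose x coordinate is x_i."""
    s, q = proper(P, keep), proper(P, other)
    if s != q:
        P[s].N += P[q].N           # combine population counts
        P[s].F[0] += P[q].F[0]     # combine first x-moments
        P[q] = s                   # old proper label now points to s
    P[s].N += 1                    # count the site that joined the fragments
    P[s].F[0] += x_i
    return s


merged = merge_at_site(P, 2012, 2005, x_i=3.0)
```

After the call, label 2002 points to 2007, so every label of the former 2002 cluster resolves to 2007 through the existing pointer chain.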

While the pointer approach is the most efficient one, many computer languages do not support pointers or structures. So the approach presented in FIG. 13 can be modified to include one-dimensional arrays only, as shown in FIG. 14. In FIG. 14, N is the traditional labeling and population count array. Its index, shown on the left of the array, denotes the cluster labels. In this example, 13 labels are permitted, but of course more can be used. C is an array that holds the class type for the cluster label in question. F^(1), F^(2), . . . , F^(n) are cluster properties. The negative N array element values in FIG. 14 are pointers to other labels, whereas the positive ones, denoted by "cnt," give the population count for the cluster. Appendix 4 presents a variation on the program of Appendix 1. It shows how a program can be written using the data structures of FIG. 14. The same data structure variations, which are based on FIG. 13 and FIG. 14, that have been applied to Appendix 1, can also be applied to Appendix 2 for the EUHK algorithm.
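
The array-only variant can be sketched as follows, using parallel one-dimensional arrays indexed by label. The array contents are made-up values for five labels rather than the 13 labels of FIG. 14, and the names `find_proper` and `merge_labels` are hypothetical.

```python
# Parallel arrays indexed by label 1..5 (index 0 unused). Values are made up.
N = [None, 3, -1, 2, -3, 1]        # positive: count; negative: pointer to a label
C = [None, 1, 1, 1, 1, 2]          # class type of each label
F1 = [None, 6.0, 0.0, 5.0, 0.0, 4.0]   # F^(1), meaningful only at proper labels


def find_proper(N, label):
    """Follow negative entries of N to the proper label."""
    while N[label] < 0:
        label = -N[label]
    return label


def merge_labels(N, C, F1, a, b):
    """Merge the clusters containing labels a and b; return the proper label."""
    s, q = find_proper(N, a), find_proper(N, b)
    if C[s] != C[q]:
        raise ValueError("labels of different classes cannot be merged")
    if s != q:
        N[s] += N[q]               # combine population counts
        F1[s] += F1[q]             # combine the F^(1) quantities
        N[q] = -s                  # q now points to the proper label s
    return s


kept = merge_labels(N, C, F1, 2, 4)   # label 2 resolves to 1, label 4 to 3
```

The same logic as the pointer version is obtained with integer indices only, at the cost of storing a separate class array C.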

The following table provides a summary for the basic HK, BBHK, and their extensions.

    ______________________________________________________________________________________
                     BBHK               EHK                UHK                EUHK
    ______________________________________________________________________________________
    analysis         cluster numbers    general cluster    cluster numbers    general cluster
                     only               parameters         only               parameters
    Classes**        1                  M                  M                  M
    type of graph    general graphs     general graphs     lattices           lattices
                     including          including          including          including
                     lattices           lattices           special graphs*    special graphs*
    size of graph    bounded            bounded            unbounded          unbounded
    ______________________________________________________________________________________
    *Special graphs denote graphs that can be described in terms of linear supernodes defined in the section titled Non-Lattice Extension for the EUHK Algorithm, and FIG. 4.
    **These numbers do not include the background class. The numbers 1 and M denote the number of classes that can be analyzed in a single pass through the lattice (for BBHK, M classes would require M passes).

The present invention is not confined to the specific examples presented here. It is expected that numerous applications and modifications will be made without departing from the spirit of the present invention.

    __________________________________________________________________________    Appendix 1                                                                    __________________________________________________________________________    Merge.sub.-- cluster.sup.1                                                    // General Comments                                                           // the // notation defines a comment                                          // the ← arrow defines an assignment. a ← b + 1 means add 1 to      data stored in                                                                //memory location b and store it memory location identified by a. Only        the content                                                                   //of memory location a is modified by this assignment                         //                                                                            //The following is pseudo code for the cluster analysis program using         data structeres                                                               //of Fig. 12. A general representaion of the data is provided by a graph      G.                                                                            // In graph G, every node i is assigned a class T(i) array                    //element.T is a one dimensional array. Its size is equal to the number       of nodes in                                                                   // the graph. We assume here that the T array has been read into memory       // from some I/O device. An example of a class is a grey level. 
If two        grey levels                                                                   // are considered, say black and white then the number of classes is two      // comments marked by EHK denote new cluster shape EHK features beyond        BBHK                                                                          // Included, but not specifically marked are also new EHK class features      //Intialization of variables                                                  For all classes t Do //t is a variable stored in memory or in CPU             register                                                                      // t identifies a given class. Its values are t=1,2,...tmax. tmax is          // is the number of classes of nodes                                          k(t) ← 0                                                                       // k(t) is a label counter for class t. Its value is incremented              when a                                                                        // a new cluster fragment is found. At this point k(t) is                     initialized                                                                   // to 0. k is an array of size tmax                                     Endloop                                                                       For all nodes i in a G Do                                                     S(i) ← -1                                                                      // mark all nodes in graph G as unvisited, S is a one dimensional             array                                                                   //of the size of the number of nodes in the graph. 
If it is marked -1         then                                                                          // the node has not been processed yet by the program                         Endloop                                                                       For all nodes i in G Do                                                       t ← T(i) //get class of node i                                           If there are no edges from i to j, or an edge from i to j exists              such that (S(j) = -1 or t ≠ T(j)) // not visited or different           class                                                                         Then //adjacent nodes are not labeled, so we start a new cluster              fragment                                                                      k(t) ← k(t) + 1 //increment label counter for class t                    S(i) ← k(t)  //label the first node of the cluster fragment              N(k(t),t) ← 1 //count this node, N is a two dimensional labeling         array whose                                                                   //size is tmax x graph size                                                   //Do generalized HK operations. F.sup.(n)  represents a generalized           cummulative                                                                   // quantity is could be a scalar or none scalar quantity. The dimensions      of the                                                                        // F arrays is the same as for the N array.                                   // In this section of code we do EHK operations                               F.sup.(1) (k(t),t) ← ƒ.sup.(1) (i) //EHK                                         <1>                                                          F.sup.(2) (k(t),t) ← ƒ.sup.(2) (i) //EHK                                         <2>                                                          ....................                                                         
 F.sup.(n) (k(t),t) ← ƒ.sup.(n) (i) //EHK                                         <3>                                                          // the following are commented examples of generalized EHK operations         // F.sup.(1) (k(t),t) ← x(i) // initialize value for the first           moment of x (EHK)                                                             // F.sup.(2) (k(t),t) ← x(i).sup.2  //initialize value for the           second moment of x (EHK)                                                      ....................                                                          // F.sup.(n) (k(t),t) ← x(i).sup.n  //initialize value for the n'th      moment of x (EHK)                                                             Else                                                                          For all edges from node i to nodes j such that (t = T(j) and S(j) ≠     -1) Do                                                                        // node i has adjacent nodes j of class t                                     s ← CLASSIFY1(j,t) // identifies the proper label of node j, s is a      label of                                                                                // node j                                                           If j is not first in the inner For loop                                       then                                                                          If q ≠ s then //merge cluster fragments. q is a storage variable        for the                                                                                 // previous cluster fragment label                                  N(s,t) ← N(s,t) + N(q,t)                                                 // Generalized EHK operations                                                 F.sup.(1) (s,t) ← F.sup.(1) (s,t) ⊕.sup.(r.sbsb.1.sup.)              
            F^(1)(q,t)  //EHK  <4>
          F^(2)(s,t) ← F^(2)(s,t) ⊕^(r_2) F^(2)(q,t)  //EHK  <5>
          ....................
          F^(n)(s,t) ← F^(n)(s,t) ⊕^(r_n) F^(n)(q,t)  //EHK  <6>
          N(q,t) ← -s  //point to the proper cluster label s
          //The following are commented examples of generalized EHK operations;
          //these are regular scalar additions for the moments of x
          //F^(1)(s,t) ← F^(1)(s,t) + F^(1)(q,t)  //EHK
          //F^(2)(s,t) ← F^(2)(s,t) + F^(2)(q,t)  //EHK
          ....................
          //F^(n)(s,t) ← F^(n)(s,t) + F^(n)(q,t)  //EHK
        Endif  //if q = s the cluster fragments are already merged; no further action is needed
      Endif
      q ← s  //set new node label to old
    Endloop
    N(s,t) ← N(s,t) + 1  //add the new node that merged the cluster fragments
                         //to the cluster count
    S(i) ← s  //label node i with the proper cluster label
    //Do generalized EHK operations for the node that merged the fragments
    F^(1)(s,t) ← F^(1)(s,t) ⊕^(r_1) f^(1)(i)  //EHK  <7>
    F^(2)(s,t) ← F^(2)(s,t) ⊕^(r_2) f^(2)(i)  //EHK  <8>
    ....................
    F^(n)(s,t) ← F^(n)(s,t) ⊕^(r_n) f^(n)(i)  //EHK  <9>
    //commented examples of generalized EHK operations for the node that merged the fragments
    //F^(1)(s,t) ← F^(1)(s,t) + x  //EHK
    //F^(2)(s,t) ← F^(2)(s,t) + x^2  //EHK
    ....................
    //F^(n)(s,t) ← F^(n)(s,t) + x^n  //EHK
  Endif
Endloop
End Merge_cluster1

CLASSIFY1(i,t)  //identifies the proper cluster label for a given cluster label i and
                //cluster class t. Returns the proper cluster label
  b ← S(i)  //assign variable b site i label
  c ← b  //assign variable c the value of b
  c ← -N(c,t)  //get new value for c from the label array N
  If c < 0 then
    return b  //b is a proper label
  Endif
  b ← c
  c ← -N(c,t)
  If c < 0 then
    return b  //b is a proper label
  Endif
  Loop until c < 0  //search for the proper label
    b ← c
    c ← -N(c,t)
  Endloop
  N(S(i),t) ← -b  //the S(i) label array index now points to the proper cluster label
  return b  //b is a proper label
End CLASSIFY1
__________________________________________________________________________
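The label search performed by CLASSIFY1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the label array N is modeled as a dictionary keyed by (label, class), the function name classify is hypothetical, and the function takes a label directly rather than a site index. As in the pseudo code, a positive entry is a population count (a proper label) and a negative entry points to another label.

```python
def classify(N, label, t):
    """Return the proper label reachable from `label` for class `t`.

    N maps (label, class) to a positive population count for a proper
    label, or to the negated label that the fragment was merged into.
    """
    root = label
    while N[(root, t)] < 0:          # negative entry: follow the pointer
        root = -N[(root, t)]
    if root != label:
        N[(label, t)] = -root        # shortcut: point the start label at the root
    return root


# Label 1 was merged into 2, and 2 into 3; label 3 is proper with 7 nodes.
N = {(1, 1): -2, (2, 1): -3, (3, 1): 7}
proper = classify(N, 1, 1)
```

After the call, label 1 points directly at label 3, so a later search from label 1 takes a single step.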

__________________________________________________________________________
Appendix 2
__________________________________________________________________________
// Example of Enhanced Unbounded HK (EUHK)
// comments marked by UHK denote UHK features beyond BBHK
// comments marked by EHK denote EHK shape features beyond BBHK
// This example is concerned with a two-dimensional lattice that is unbounded
// along its length dimension. In this example the lattice is viewed as a continuous
// two-dimensional image.
merge_cluster2  //example for the EUHK algorithm
S(2,R)  //two-dimensional array of size 2 x R, where R is the row size.
        // S stores the sites' labels for scanned lines; row 1 stores all
        // even rows and row 2 all odd rows of the scanned lattice
T(2,R)  //two-dimensional array for temporary storage of site classes for
        // 2 scanned lines. It has the same dimensions as S. If an element of T is 0
        // then it belongs to the background class. The following illustration
        // shows the shape of T and S
##STR1##
// odd line
// The above two-dimensional array represents a two-row
// scan window. Scanned even lines are stored in the top
// row. Odd scanned lines are stored in the bottom row
SET INTERRUPT=OFF  //UHK, when INTERRUPT is turned on by an
                   // asynchronous process the execution of the program is terminated
// Initialization
For J ← 1 to R  // J is an array index that loops over all row pixels
  S(1,J) ← 0  // zero row 1
  T(1,J) ← 0  // zero row 1
End For
I ← 1  // I is a pointer to either row 1 or 2 of S and T. Its value is either 1 or 2
II ← 1  // II is a toggle that is either 1 or -1. It helps set the proper value of I
OI ← 1  // OI is the previous line pointer of S and T;
        // if I = 1 then OI = 2, if I = 2 then OI = 1
IY ← 0  // IY is a counter for the number of actual rows scanned since the
        // beginning of the run. It denotes the Y position of a scanned pixel
Do Forever  //UHK, do row processing
  I ← I + II
  II ← -II
  IY ← IY + 1
  Read scanning device raw data for 1 scanned line
  Convert raw data into pixel class data and store in row I of array T
  Set all k(t) label counters to their initial values for row I
      // t is a class identifier. Rows 1 and 2 each have a different range of values for k(t)
  For J ← 1 To R  //process row I, J denotes the x position of a pixel
    If T(I,J) = 0 Then Continue  //this pixel does not require cluster analysis
    p ← 0; s ← 0  //p and s are label variables
    t ← T(I,J)  //get pixel class and store it in variable t
    If T(I,J - 1) = t Then  //check pixel class of the previous pixel on the same row
      s ← CLASSIFY1(S(I,J - 1),t)  //get its proper label, CLASSIFY1 is defined in Appendix 1
    EndIf
    If T(OI,J) = t Then  //check adjacent pixel class on the previous row
      p ← CLASSIFY1(S(OI,J),t)
    EndIf
    If s = 0 and p = 0 then  //no adjacent pixels of class t found
      k(t) ← k(t) + 1  //increment counter because a new cluster fragment is found
      N(k(t),t) ← 1  //initialize value of label array element
      X(k(t),t) ← J  //EHK, initialize first x moment of X array element
      XX(k(t),t) ← J^2  //EHK, initialize second x moment of XX array element
      Y(k(t),t) ← IY  //EHK, initialize first y moment of Y array element
      YY(k(t),t) ← IY^2  //EHK, initialize second y moment of YY array element
      S(I,J) ← k(t)  //label the pixel
    else
      if (s ≠ 0 and p = 0)  //previous pixel is labeled
         or (s = p) then  //previous row adjacent pixel label is the same as
                          //the previous pixel's label
        N(s,t) ← N(s,t) + 1
        X(s,t) ← X(s,t) + J  //EHK, x first moment
        XX(s,t) ← XX(s,t) + J^2  //EHK, x second moment
        Y(s,t) ← Y(s,t) + IY  //EHK, y first moment
        YY(s,t) ← YY(s,t) + IY^2  //EHK, y second moment
        S(I,J) ← s  //label the pixel
      else
        if (p ≠ 0) then  //adjacent row pixel is labeled
          if (s ≠ 0) then  //previous pixel is labeled
            //merge two subclusters and their parameters
            N(s,t) ← N(p,t) + N(s,t) + 1
            X(s,t) ← X(p,t) + X(s,t) + J  //EHK, get first moment for x coordinate
            XX(s,t) ← XX(p,t) + XX(s,t) + J^2  //EHK, get second moment for x coordinate
            Y(s,t) ← Y(p,t) + Y(s,t) + IY  //EHK, get first moment for y coordinate
            YY(s,t) ← YY(p,t) + YY(s,t) + IY^2  //EHK, get second moment for y coordinate
            N(p,t) ← -s
            S(I,J) ← s  //label the pixel
          else  //previous pixel is not labeled or not of the same class as the current one
            If p ∉ group(I) then  //p belongs to the previous row label set and there is no
                                  //previous adjacent pixel label, so we need to create a new proper
                                  //cluster label for the current row and point the old label to the new one
              k(t) ← k(t) + 1
              N(k(t),t) ← N(p,t)
              X(k(t),t) ← X(p,t)  //EHK, carry the moments over to the new label
              XX(k(t),t) ← XX(p,t)  //EHK
              Y(k(t),t) ← Y(p,t)  //EHK
              YY(k(t),t) ← YY(p,t)  //EHK
              N(p,t) ← -k(t)
              p ← k(t)
            EndIf
            N(p,t) ← N(p,t) + 1
            X(p,t) ← X(p,t) + J  //EHK
            XX(p,t) ← XX(p,t) + J^2  //EHK
            Y(p,t) ← Y(p,t) + IY  //EHK
            YY(p,t) ← YY(p,t) + IY^2  //EHK
            S(I,J) ← p  //label the pixel
          EndIf
        EndIf
      EndIf
    EndIf
  End For
  cluster(OI)  //analyze all completed clusters up to the previous row
  OI ← I
  if INTERRUPT = ON then  //UHK, after a row is completed, we check for interrupts
    STOP  //UHK, stop program execution
  EndIf  //UHK
End Forever  //UHK

cluster(I)  //This procedure has UHK features that enable continuous real-time analysis of
            //clusters. In this example both cluster populations and EHK
            //parameters such as cluster center of mass and moment of inertia are
            //analyzed. If an abnormal cluster parameter occurs, it alerts the system
  for all pixel classes t
    for all proper cluster labels i ∈ group(I)
      CG ← (X(i,t)^2 + Y(i,t)^2)/N(i,t)  //EHK, CG is the sum of the squares of the cluster X
                                         // and Y first moments divided by the cluster population count
      MI ← (XX(i,t) + YY(i,t)) - CG  //EHK, MI is the cluster moment of inertia
      if N(i,t) > N_max then  //UHK, check for abnormal cluster size N_max
        if MI > MI_max then  //UHK, EHK, check for abnormal cluster moment of inertia MI_max
          alert(t,CG,N(i,t),MI)  //UHK, EHK, alerts system of cluster abnormality
        EndIf
      EndIf
      statistics(t,CG,N(i,t),MI)  //collect statistics on intermediate cluster parameters
    End For
  End For
End procedure cluster

group(i)
  //finds whether a cluster label j belongs to row 1 or 2 of array S.
  //This is usually done by dividing the label value space into two ranges. The
  //bottom range is for odd rows and the top range is for even rows
End group

Stop_Program  //UHK, this allows an asynchronous termination of the program;
              //without this capability the program would run indefinitely
  SET INTERRUPT=ON  //UHK
End Stop_Program  //UHK

Alert
  //alert is a UHK procedure. It alerts the system of an abnormal cluster condition.
  //It can record the abnormal event and even end the scanning and stop the program
End Alert

Statistics
  //collects cluster statistics for both BBHK and EHK parameters
End Statistics
__________________________________________________________________________
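The row-by-row processing of Appendix 2 can be sketched in Python as follows. This is a simplified illustration, not the appendix procedure itself: the function name scan and the union-find table parent are assumptions, labels are never reused (so parent grows with the image, unlike the label reuse scheme above), and rows are supplied as lists of class identifiers with 0 for background. A cluster is reported as complete once it has no pixel on the most recently scanned row, and its moment of inertia is computed from the accumulated moments as MI = (XX + YY) - (X^2 + Y^2)/N.

```python
def scan(rows):
    """Return (class, N, MI) for every cluster, in order of completion."""
    parent = {}          # label -> parent label (union-find forest)
    stats = {}           # proper label -> [class, N, X, XX, Y, YY]
    results = []
    next_label = 1
    prev_t, prev_s = [], []

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    def emit(m):
        t, n, X, XX, Y, YY = m
        results.append((t, n, (XX + YY) - (X * X + Y * Y) / n))

    for y, row in enumerate(rows, start=1):
        cur_s = [0] * len(row)
        for j, t in enumerate(row):
            if t == 0:
                continue                      # background pixel
            x = j + 1
            s = find(cur_s[j - 1]) if j > 0 and row[j - 1] == t else 0
            p = find(prev_s[j]) if j < len(prev_t) and prev_t[j] == t else 0
            if s == 0 and p == 0:             # new cluster fragment
                s = next_label
                next_label += 1
                parent[s] = s
                stats[s] = [t, 0, 0, 0, 0, 0]
            elif s == 0:                      # only the previous row is labeled
                s = p
            elif p != 0 and p != s:           # merge fragment p into s
                merged = stats.pop(p)
                for k in range(1, 6):
                    stats[s][k] += merged[k]
                parent[p] = s
            m = stats[s]                      # add the current pixel
            m[1] += 1
            m[2] += x
            m[3] += x * x
            m[4] += y
            m[5] += y * y
            cur_s[j] = s
        live = {find(lbl) for lbl in cur_s if lbl}
        for root in [r for r in stats if r not in live]:
            emit(stats.pop(root))             # no pixel on this row: complete
        prev_t, prev_s = row, cur_s
    for root in list(stats):                  # flush clusters still open at the end
        emit(stats.pop(root))
    return results


u_shape = scan([[1, 0, 1],
                [1, 1, 1]])
```

In the U-shaped example the two fragments found on the first row are merged by the last pixel of the second row, giving a single cluster of five pixels.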

__________________________________________________________________________
Appendix 3
__________________________________________________________________________
Merge_cluster3
//The following is pseudo code for the cluster analysis program
//using data structures given in Fig. 13.
//A general representation of the data is provided by a graph G.
//The construct a → b denotes the value of b pointed to by a.
//In graph G, every node i is assigned a class T(i) array element.
//T is a one-dimensional array. Its size is equal to the number of nodes in
//the graph. We assume here that the T array has been read into memory
//from some I/O device.
//Comments marked by EHK denote new EHK features beyond BBHK.
//The labels used here are memory addresses of array ptr_array. ptr_array is of the size
//of the graph. ptr_array is an array of pointers pointing to structure R instances. Each
//structure R instance holds cluster parameter information as described in Fig. 14.
//R is defined by the following components: {t, N, F^(1), F^(2), ..., F^(n)}. t is the cluster
//class, N is the population counter and F^(n) are the generalized cluster parameters.
//The function address(p) gives the memory address of p. The function
//value(p) gives the value of the object pointed to by p.
//value(p) is either the address of an instance of R or a value of an element
//(which is an address) of ptr_array. In this pseudo code it is assumed that instances of R
//can be created through a New operator. It is assumed that when an R instance loses its
//pointer in the computation, the memory associated with the R instance is
//reclaimed through some mechanism of garbage collection.
//Initialization of variables
ptr ← address(ptr_array(1))  //ptr is a label counter. Its value is incremented when
                             //a new cluster fragment is found. ptr is initialized
                             //to the address of the first element of ptr_array. ptr_array
                             //is a one-dimensional array of pointers of the size of the graph
For all nodes i in G Do
  S(i) ← -1  //mark all nodes in graph G as unvisited. S is a one-dimensional array
             //of the size of the number of nodes in the graph. If it is marked -1 then
             //the node has not been processed yet by the program
Endloop
For all nodes i in G Do
  t ← T(i)  //get node type for node i
  If there are no edges from i to j, or every edge from i to j is
     such that (S(j) = -1 or t ≠ T(j))  //not visited or different class
  Then  //adjacent nodes are not labeled, so we start a new cluster fragment
    S(i) ← ptr  //label the first node of the cluster fragment
    value(ptr) ← New R  //create a new structure instance R for label ptr and place
                        //it in the ptr_array element whose address is ptr
    (value(ptr) → N) ← 1  //count this node
    (value(ptr) → t) ← t  //EHK, set the cluster class to t
    //F^(n) represents a generalized cumulative
    //quantity; it could be a scalar or non-scalar quantity.
    //In this section of code we do EHK operations
    (value(ptr) → F^(1)) ← f^(1)(i)  //EHK  <1>
    (value(ptr) → F^(2)) ← f^(2)(i)  //EHK  <2>
    ....................
    (value(ptr) → F^(n)) ← f^(n)(i)  //EHK  <3>
    ptr ← ptr + 1  //increment ptr to the next ptr_array element
  Else
    For all edges from node i to nodes j such that (t = T(j) and S(j) ≠ -1) Do
      s ← CLASSIFY2(j)  //identifies the proper label of node j; s is a label of node j
      If j is not first in the inner For loop
      then
        If q ≠ s then  //merge cluster fragments. q is a storage variable for the
                       //previous cluster fragment label
          (value(s) → N) ← (value(s) → N) + (value(q) → N)
          //Generalized EHK operations
          (value(s) → F^(1)) ← (value(s) → F^(1)) ⊕^(r_1) (value(q) → F^(1))  //EHK  <4>
          (value(s) → F^(2)) ← (value(s) → F^(2)) ⊕^(r_2) (value(q) → F^(2))  //EHK  <5>
          ....................
          (value(s) → F^(n)) ← (value(s) → F^(n)) ⊕^(r_n) (value(q) → F^(n))  //EHK  <6>
          value(q) ← s  //point to the new proper cluster label s
        Endif  //if q = s the cluster fragments are already merged; no further action is needed
      Endif
      q ← s  //set new node label to old
    Endloop
    (value(s) → N) ← (value(s) → N) + 1  //add the new node that merged the cluster fragments
                                         //to the cluster count
    S(i) ← s  //label node i with the proper cluster label
    //Do generalized EHK operations for the node that merged the fragments
    (value(s) → F^(1)) ← (value(s) → F^(1)) ⊕^(r_1) f^(1)(i)  //EHK  <7>
    (value(s) → F^(2)) ← (value(s) → F^(2)) ⊕^(r_2) f^(2)(i)  //EHK  <8>
    ....................
    (value(s) → F^(n)) ← (value(s) → F^(n)) ⊕^(r_n) f^(n)(i)  //EHK  <9>
  Endif
Endloop
End Merge_cluster3

CLASSIFY2(i)  //identifies the proper cluster label for a given cluster label i
  b ← S(i)  //assign variable b site i label
  c ← b  //assign variable c the value of b
  c ← value(c)  //get a new value for c
  If c is not in the label range then  //the label range consists of the addresses from
                                       //the first to the last element of ptr_array
    return b  //b is a proper label because it points to an instance of R
  Endif
  b ← c
  c ← value(c)
  If c is not in the label range then
    return b  //b is a proper label
  Endif
  Loop until c is not in the label range  //search for the proper label
    b ← c
    c ← value(c)
  Endloop
  value(S(i)) ← b  //the S(i) label pointer now points to the proper cluster label
  return b  //b is a proper label
End CLASSIFY2
__________________________________________________________________________
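The pointer-based labels of Appendix 3 can be mimicked in Python with a list whose elements hold either a record (a proper label) or the index of another element (a merged label). This is a minimal sketch under stated assumptions: the names R, slots, classify2 and merge are illustrative, list indices stand in for memory addresses, and a single scalar F combined by ordinary addition stands in for the generalized quantities.

```python
from dataclasses import dataclass


@dataclass
class R:
    t: int      # cluster class
    N: int      # population count
    F: float    # one generalized cumulative quantity (scalar, combined by +)


def classify2(slots, label):
    """Follow index pointers until a slot holding an R record is reached."""
    root = label
    while isinstance(slots[root], int):
        root = slots[root]
    if root != label:
        slots[label] = root      # point the start label at the proper label
    return root


def merge(slots, s, q):
    """Merge proper label q into proper label s."""
    slots[s].N += slots[q].N
    slots[s].F += slots[q].F
    slots[q] = s                 # q's record is now unreferenced and can be reclaimed


slots = [R(1, 2, 3.0), R(1, 1, 5.0), R(1, 4, 1.0)]
merge(slots, 0, 1)               # label 1 now points to label 0
merge(slots, 2, 0)               # label 0 now points to label 2
proper = classify2(slots, 1)
```

Replacing a record by an index is what lets the memory of a merged fragment be reclaimed, which is the property the appendix relies on garbage collection for.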

    __________________________________________________________________________    Appendix 4                                                                    __________________________________________________________________________    Merge.sub.-- cluster4                                                         //The following is pseudo code for the cluster analysis program               //using data structures of Fig. 14.                                           //A general representaion of the data is provided by a graph G.               // In graph G, every node i is assigned a class T(i) array                    //element.T is a one imensional array. It size is equal to the number of      nodes in                                                                      // the graph. We assume here that the graph and T arrays have been read       into memory                                                                   // from some I/O device. An example of a class is a grey level. If two        grey levels                                                                   // are considered, say black and white then the number of classes is two      // comments marked by EHK denote new EHK features beyond BBHK                 // Included, but not specifically marked are also new EHK class features      //Intialization of variables                                                  k ← 0 // k a label counter. Its value is incremented when a              // a new cluster fragment is found. At this point k is initialized to 0.      For all nodes i in a G Do                                                     S(i) ← -1 // mark all nodes in graph G as unvisited, S is a one          dimensional array                                                             //of the size of the number of nodes in the graph. 
If it is marked -1         then                                                                          // the node has not been processed yet by the program                         Endloop                                                                       For all nodes i in G Do // Do cluster analysis                                t ← T(i) //get node type for node i                                      If there are no edges from i to j, or an edge from i to j exists              such that (S(j) = -1 or t ≠ T(j))                                       Then //adjacent nodes are not labeled, so we start a new cluster              fragment                                                                      k ← k + 1 //increment label counter                                      S(i) ← k //label the first node of the cluster fragment                  N(k) ← 1 //count this node, N is a one dimensional array of the size     of the graph                                                                  C(k) ← t //EHK, mark the class of this cluster                           //Do generalized HK operations. F.sup.(n)  represents a generalized           cummulative                                                                   // quantity is could be a scalar or none scalar quantity. The dimensions      of the                                                                        // F.sup.(n)  arrays is the same size as for the N array.                     
        ƒ^(n)(i) are calculated      // quantities for vertex (site) i
        // In this section of code we do EHK operations
        F^(1)(k) ← ƒ^(1)(i)          //EHK <1>
        F^(2)(k) ← ƒ^(2)(i)          //EHK <2>
        ....................
        F^(n)(k) ← ƒ^(n)(i)          //EHK <3>
        // the following are commented examples of generalized EHK operations
        //F^(1)(k) ← x(i)            // initialize value for the first moment of x (EHK)
        //F^(2)(k) ← x(i)^2          // initialize value for the second moment of x (EHK)
        ....................
        //F^(n)(k) ← x(i)^n          // initialize value for the n'th moment of x (EHK)
    Else
        For all edges from node i to nodes j such that (t = T(j) and S(j) ≠ -1) Do
            s ← CLASSIFY(j)          // identifies the proper label of node j, s is a label of node j
            If j is not first in the inner For loop then
                If q ≠ s then        // merge partial clusters. q is a storage variable for the
                                     // previous cluster fragment label
                    N(s) ← N(s) + N(q)
                    // Generalized EHK operations
                    F^(1)(s) ← F^(1)(s) ⊕^(r_1) F^(1)(q)    //EHK <4>
                    F^(2)(s) ← F^(2)(s) ⊕^(r_2) F^(2)(q)    //EHK <5>
                    ....................
                    F^(n)(s) ← F^(n)(s) ⊕^(r_n) F^(n)(q)    //EHK <6>
                    N(q) ← -s        // point to the proper cluster label s
                    // The following are commented examples of Generalized EHK operations
                    // these are regular scalar additions for the moments of x
                    //F^(1)(s) ← F^(1)(s) + F^(1)(q)        //EHK
                    //F^(2)(s) ← F^(2)(s) + F^(2)(q)        //EHK
                    ....................
                    //F^(n)(s) ← F^(n)(s) + F^(n)(q)        //EHK
                Endif                // if q = s cluster fragments are already merged, no further
                                     // action is needed
            Endif
            q ← s                    // set new node label to old
        Endloop
        N(s) ← N(s) + 1              // add the new node that merged the cluster fragments
                                     // to the cluster count
        // Do generalized EHK operations for the node that merged the fragments
        F^(1)(s) ← F^(1)(s) ⊕^(r_1) ƒ^(1)(i)    //EHK <7>
        F^(2)(s) ← F^(2)(s) ⊕^(r_2) ƒ^(2)(i)    //EHK <8>
        ....................
        F^(n)(s) ← F^(n)(s) ⊕^(r_n) ƒ^(n)(i)    //EHK <9>
        // commented generalized EHK operations for the node that merged the fragments
        //F^(1)(s) ← F^(1)(s) + x(i)            //EHK
        //F^(2)(s) ← F^(2)(s) + x(i)^2          //EHK
        ....................
        //F^(n)(s) ← F^(n)(s) + x(i)^n          //EHK
    Endif
Endloop
End Merge_cluster4

CLASSIFY(i)   // identifies the proper cluster label for a given cluster label i and
              // cluster class t. Returns proper cluster label
    b ← S(i)                         // assign variable b site i label
    c ← b                            // assign variable c the value of b
    c ← -N(c)                        // get new value for c from the label array N
    If c < 0 then return b           // b is a proper label
    b ← c
    c ← -N(c)
    If c < 0 then return b           // b is a proper label
    Loop until c < 0                 // search for the proper label
        b ← c
        c ← -N(c)
    Endloop
    N(S(i)) ← -b                     // S(i) label array index points now to the proper label
    return b                         // b is a proper label
End CLASSIFY
__________________________________________________________________________
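The merge and CLASSIFY steps listed above can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes a two-class (binary) two-dimensional image scanned row by row with nearest-neighbor connectivity, it omits label reuse and the class test t = T(j), and `classify` takes a label rather than a site. The names `classify`, `scan`, `site_props` and `ops` are hypothetical and chosen only for this illustration. As in the listing, a positive N[k] holds the population of proper label k and a negative N[k] points to label -N[k].

```python
import operator


def classify(N, label):
    """Follow negative links in N until a proper label (N[root] > 0) is reached."""
    root = label
    while N[root] < 0:
        root = -N[root]
    if root != label:
        N[label] = -root  # shortcut: point the starting label at the proper label
    return root


def scan(grid, site_props, ops):
    """Single-pass labeling of a binary grid with per-cluster feature merging.

    site_props(x, y) returns one site value f^(n)(i) per operator in ops;
    each operator plays the role of the associative, commutative ⊕^(n).
    """
    rows, cols = len(grid), len(grid[0])
    S = [[0] * cols for _ in range(rows)]  # site labels, 0 means unlabeled
    N = [0]                                # label array, index 0 unused
    F = [[None] for _ in ops]              # F[n][k] is feature n of label k
    for y in range(rows):
        for x in range(cols):
            if not grid[y][x]:
                continue
            f = site_props(x, y)
            # Already-scanned neighbors: left and up.
            roots = []
            for lab in (S[y][x - 1] if x > 0 else 0, S[y - 1][x] if y > 0 else 0):
                if lab:
                    r = classify(N, lab)
                    if r not in roots:
                        roots.append(r)
            if not roots:
                # No labeled neighbor: start a new cluster fragment.
                N.append(1)
                for n in range(len(ops)):
                    F[n].append(f[n])
                s = len(N) - 1
            else:
                s = roots[0]
                for q in roots[1:]:
                    # Merge fragment q into s, then redirect q to s.
                    N[s] += N[q]
                    for n, op in enumerate(ops):
                        F[n][s] = op(F[n][s], F[n][q])
                    N[q] = -s
                # Add the site that joined the fragments.
                N[s] += 1
                for n, op in enumerate(ops):
                    F[n][s] = op(F[n][s], f[n])
            S[y][x] = s
    return {k: (N[k], [F[n][k] for n in range(len(ops))])
            for k in range(1, len(N)) if N[k] > 0}


grid = [[1, 0, 1],
        [1, 1, 1],
        [0, 0, 0]]
# Feature 0: sum of x (first moment); feature 1: maximum x coordinate.
clusters = scan(grid, lambda x, y: (x, x), (operator.add, max))
```

In this example the two fragments started in the first row are joined by the last site of the second row, so the population and both features of the second fragment are folded into the first in the same pass.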

I claim:
 1. An inspection and analysis system comprising:
    a sensing system for sensing features in an object and providing digital signals representative thereof;
    a programmed computer coupled to the sensing system, said computer being arranged to perform cluster analysis on data received from said sensing system, wherein said computer is programmed to employ an enhanced Hoshen-Kopelman algorithm with label reuse for identifying and labeling clusters concurrently for each of a plurality of data groups to detect and quantify cluster features in said data groups, and merge cluster shape feature properties for each of said data groups, thereby to detect and quantify cluster features other than cluster population; and
    an output device connected to indicate the identified and quantified cluster information developed from inspecting and analyzing the sensed object to a user of the system.
 2. An inspection and analysis system comprising:
    a sensing system for sensing features in an object and providing digital signals representative thereof;
    a multi-processing programmed computer coupled to the sensing system, said computer being arranged to perform cluster analysis on data received from said sensing system, wherein said computer is programmed to employ an unbounded Hoshen-Kopelman algorithm with label reuse to identify and label clusters concurrently for each of a plurality of data groups to detect and quantify cluster features in said data groups, and to merge cluster shape feature properties for each of said data groups thereby to permit the substantially continuous inspection and analysis of said object; and
    an output device connected to indicate the identified and quantified cluster information developed from inspecting and analyzing the sensed object to a user of the system.
 3. A method for extracting cluster shape feature properties of a d-dimensional image that can be unbounded in a first dimension, wherein said image consists of one or more classes containing sites selected from the group consisting of pixels and voxels, comprising the steps of:
    (a) inputting an image section j of said image;
    (b) determining class type t of site i in image section j;
    (c) identifying zero or more distinct p proper cluster labels $k_1, \ldots, k_p$ of labeled sites adjacent and of the same class type t as site i using a capability of a basic bounded Hoshen-Kopelman algorithm with label reuse for identifying proper labels of clusters;
    (d) calculating scalar and non-scalar image m site properties $f^{(n)}(i)$ for site i where n = 1, 2, ..., m;
    (e) if p = 0, computing for all n = 1, 2, ..., m $F^{(n)}(k) = f^{(n)}(i)$ and C(k) = t, wherein $F^{(n)}(k)$ is a cluster shape feature property of a cluster fragment of the same mathematical type as $f^{(n)}(i)$, C(k) is a cluster class type identifier and k is a cluster proper label selected using a basic bounded Hoshen-Kopelman algorithm with label reuse capability for selecting a cluster proper label when p = 0;
    (f) if p > 0, computing for all n = 1, 2, ..., m

    $$F^{(n)}(k) = F^{(n)}(k_1) \oplus^{(n)} \cdots \oplus^{(n)} F^{(n)}(k_p) \oplus^{(n)} f^{(n)}(i)$$

    wherein $\oplus^{(n)}$ is a binary associative and commutative operator that operates on said $f^{(n)}(i)$ and cluster feature shape properties $F^{(n)}(k_1), \ldots, F^{(n)}(k_p)$ of the same mathematical type as $f^{(n)}(i)$ and $F^{(n)}(k)$, and k is a cluster proper label selected using a capability of the Hoshen-Kopelman algorithm with label reuse for selecting a cluster proper label when p > 0;
    (g) repeating steps (a), (b), (c), (d), (e) and (f) for all sites in image section j;
    (h) outputting of all the determined $F^{(n)}(k)$ cluster feature shape properties for
        i. each labeled cluster extending up to the image section previous to image section j and
        ii. each labeled cluster and labeled cluster fragment extending into image section j;
    (i) stopping if processing of said image is interrupted;
    (j) stopping if a number of contiguous image sections to be processed is defined and if said number is equal to the number of already processed image sections of said image; and
    (k) repeating steps (a), (b), (c), (d), (e), (f), (h), (i) and (j) for the next image section j of said image.
 4. The method according to claim 3 wherein for n = 1, $\oplus^{(n)}$ is a scalar addition operator and, if the said image is bounded and consists of two or more classes, $f^{(n)}(i)$ is set to 1 and $F^{(n)}(k)$ is a cluster population count.
 5. The method according to claim 3 wherein for n = 1, $\oplus^{(n)}$ is a scalar addition operator and, if said image is unbounded, $f^{(n)}(i)$ is set to 1 and $F^{(n)}(k)$ is a cluster population count.
 6. The method according to claim 3 wherein $\oplus^{(n)}$ is a scalar addition operator for n = 1, ..., 9;
    $f^{(n)}(i)$ is set to $x_i$, the x coordinate of site i, and $F^{(n)}(k)$ is a first cluster moment for n = 1;
    $f^{(n)}(i)$ is set to $y_i$, the y coordinate of site i, and $F^{(n)}(k)$ is a first cluster moment for n = 2;
    $f^{(n)}(i)$ is set to $z_i$, the z coordinate of site i, and $F^{(n)}(k)$ is a first cluster moment for n = 3;
    $f^{(n)}(i)$ is set to $x_i^2$ and $F^{(n)}(k)$ is a second cluster moment for n = 4;
    $f^{(n)}(i)$ is set to $y_i^2$ and $F^{(n)}(k)$ is a second cluster moment for n = 5;
    $f^{(n)}(i)$ is set to $z_i^2$ and $F^{(n)}(k)$ is a second cluster moment for n = 6;
    $f^{(n)}(i)$ is set to $x_i y_i$ and $F^{(n)}(k)$ is a second cluster moment for n = 7;
    $f^{(n)}(i)$ is set to $x_i z_i$ and $F^{(n)}(k)$ is a second cluster moment for n = 8; and
    $f^{(n)}(i)$ is set to $y_i z_i$ and $F^{(n)}(k)$ is a second cluster moment for n = 9.
 7. The method according to claim 3 wherein $\oplus^{(n)}$ is a scalar addition operator and $f^{(n)}(i)$ is set to the product quantity ##EQU10## where $x_r(i)$ are the r'th coordinates of the i'th site in d dimensions raised to the power m(r), so that $F^{(n)}(k)$ is a cluster moment for n = 1.
 8. The method of claim 3 wherein
    for n = 1, $f^{(n)}(i)$ is set to the $x_r(i)$ r'th coordinate of the i'th site, $F^{(n)}(k)$ is a maximum cluster coordinate value for the coordinate $x_r$ and $\oplus^{(n)}$ corresponds to picking the maximum value of two operands to which $\oplus^{(n)}$ is applied; and
    for n = 2, $f^{(n)}(i)$ is set to the $x_r(i)$ r'th coordinate of the i'th site, $F^{(n)}(k)$ is a minimum cluster coordinate value for the coordinate $x_r$, and $\oplus^{(n)}$ corresponds to picking the minimum value of two operands to which $\oplus^{(n)}$ is applied.
 9. The method according to claim 3 wherein step (h) further comprises outputting a fault alert and stopping cluster shape feature extraction if one or more of said $F^{(n)}(k)$ properties is determined to extend beyond a predetermined tolerance.
 10. The method according to claim 3 wherein for n = 1, $f^{(n)}(i)$ is set to 1 if all sites that are neighbors of site i belong to the same class as site i, and otherwise $f^{(n)}(i)$ is set to 0, $F^{(n)}(k)$ is a cluster perimeter sites count and $\oplus^{(n)}$ is a scalar addition operator.
 11. The method according to claim 3 wherein
    for n = 1, $\oplus^{(n)}$ is a scalar addition operator, $f^{(n)}(i)$ is the number of edges from site i to its neighbor sites that have already been labeled and are of the same class type as site i, and $F^{(n)}(k)$ is a cluster edge count of edges between sites belonging to said cluster; and
    for n = 2, $\oplus^{(n)}$ is a scalar addition operator, $f^{(n)}(i)$ is the number of edges from site i to its neighbor sites that have already been labeled and are not of the same class type as site i, and $F^{(n)}(k)$ is a cluster edge count between sites of cluster k to adjacent sites not belonging to cluster k.
 12. The method according to claim 3 wherein for n = 1, $\oplus^{(n)}$ is a scalar addition operator, $f^{(n)}(i)$ is set to an image property g(i) for site i and $F^{(n)}(k)$ is a cumulative cluster property for g(i).
 13. A method for extracting cluster shape feature properties in a d-dimensional image that can be unbounded in a first dimension, wherein said image consists of one or more classes containing sites of one or more classes represented by vectors contained within cells, comprising the steps of:
    (a) inputting an image section j of said cells of said image;
    (b) determining class type t of site i in cell q in image section j;
    (c) identifying zero or more distinct p proper cluster labels $k_1, \ldots, k_p$ of labeled adjacent sites in adjacent cells and of the same class type t as site i using a capability of a basic bounded Hoshen-Kopelman algorithm with label reuse for identifying proper labels of clusters;
    (d) calculating scalar and non-scalar image m site properties $f^{(n)}(i)$ for site i where n = 1, 2, ..., m;
    (e) if p = 0, computing for all n = 1, 2, ..., m $F^{(n)}(k) = f^{(n)}(i)$ and C(k) = t, wherein $F^{(n)}(k)$ is a cluster shape feature property of a cluster fragment of the same mathematical type as $f^{(n)}(i)$, C(k) is a cluster class type identifier and k is a cluster proper label selected using a basic bounded Hoshen-Kopelman algorithm with label reuse capability for selecting a cluster proper label when p = 0;
    (f) if p > 0, computing for all n = 1, 2, ..., m

    $$F^{(n)}(k) = F^{(n)}(k_1) \oplus^{(n)} \cdots \oplus^{(n)} F^{(n)}(k_p) \oplus^{(n)} f^{(n)}(i)$$

    wherein $\oplus^{(n)}$ is a binary associative and commutative operator that operates on said $f^{(n)}(i)$ and cluster feature shape properties $F^{(n)}(k_1), \ldots, F^{(n)}(k_p)$ of the same mathematical type as $f^{(n)}(i)$ and $F^{(n)}(k)$, and k is a cluster proper label selected using a capability of the Hoshen-Kopelman algorithm with label reuse for selecting a cluster proper label when p > 0;
    (g) repeating steps (a), (b), (c), (d), (e) and (f) for all sites belonging to all cells in image section j;
    (h) outputting of all the determined $F^{(n)}(k)$ cluster feature shape properties for
        i. each labeled cluster extending up to the image section previous to image section j and
        ii. each labeled cluster and labeled cluster fragment extending into image section j;
    (i) stopping if processing of said image is interrupted;
    (j) stopping if a number of contiguous image sections to be processed is defined and if said number is equal to the number of already processed image sections of said image; and
    (k) repeating steps (a), (b), (c), (d), (e), (f), (h), (i) and (j) for the next image section j of said image.
 14. An apparatus for extracting cluster shape feature properties of a d-dimensional image that can be unbounded in a first dimension, wherein said image consists of one or more classes containing sites selected from the group consisting of pixels and voxels, comprising:
    (a) means for inputting an image section j of said image;
    (b) means for determining class type t of site i in image section j;
    (c) means for identifying zero or more distinct p proper cluster labels $k_1, \ldots, k_p$ of labeled sites adjacent and of the same class type t as site i using a capability of a basic bounded Hoshen-Kopelman algorithm with label reuse for identifying proper labels of clusters;
    (d) means for calculating scalar and non-scalar image m site properties $f^{(n)}(i)$ for site i where n = 1, 2, ..., m;
    (e) if p = 0, means for computing for all n = 1, 2, ..., m $F^{(n)}(k) = f^{(n)}(i)$ and C(k) = t, wherein $F^{(n)}(k)$ is a cluster shape feature property of a cluster fragment of the same mathematical type as $f^{(n)}(i)$, C(k) is a cluster class type identifier and k is a cluster proper label selected using a basic bounded Hoshen-Kopelman algorithm with label reuse capability for selecting a cluster proper label when p = 0;
    (f) if p > 0, means for computing for all n = 1, 2, ..., m

    $$F^{(n)}(k) = F^{(n)}(k_1) \oplus^{(n)} \cdots \oplus^{(n)} F^{(n)}(k_p) \oplus^{(n)} f^{(n)}(i)$$

    wherein $\oplus^{(n)}$ is a binary associative and commutative operator that operates on said $f^{(n)}(i)$ and cluster feature shape properties $F^{(n)}(k_1), \ldots, F^{(n)}(k_p)$ of the same mathematical type as $f^{(n)}(i)$ and $F^{(n)}(k)$, and k is a cluster proper label selected using a capability of the Hoshen-Kopelman algorithm with label reuse for selecting a cluster proper label when p > 0;
    (g) means for repeating (a), (b), (c), (d), (e) and (f) for all sites in image section j;
    (h) means for outputting of all the determined $F^{(n)}(k)$ cluster feature shape properties for
        i. each labeled cluster extending up to the image section previous to image section j and
        ii. each labeled cluster and labeled cluster fragment extending into image section j;
    (i) means for stopping if processing of said image is interrupted;
    (j) means for stopping if a number of contiguous image sections to be processed is defined and if said number is equal to the number of already processed image sections of said image; and
    (k) means for repeating (a), (b), (c), (d), (e), (f), (h), (i) and (j) for the next image section j of said image.
 15. The apparatus according to claim 14 wherein for n = 1, $\oplus^{(n)}$ is a scalar addition operator and, if the said image is bounded and consists of two or more classes, $f^{(n)}(i)$ is set to 1 and $F^{(n)}(k)$ is a cluster population count.
 16. The apparatus according to claim 14 wherein for n = 1, $\oplus^{(n)}$ is a scalar addition operator and, if said image is unbounded, $f^{(n)}(i)$ is set to 1 and $F^{(n)}(k)$ is a cluster population count.
 17. The apparatus according to claim 14 wherein $\oplus^{(n)}$ is a scalar addition operator for n = 1, ..., 9;
    $f^{(n)}(i)$ is set to $x_i$, the x coordinate of site i, and $F^{(n)}(k)$ is a first cluster moment for n = 1;
    $f^{(n)}(i)$ is set to $y_i$, the y coordinate of site i, and $F^{(n)}(k)$ is a first cluster moment for n = 2;
    $f^{(n)}(i)$ is set to $z_i$, the z coordinate of site i, and $F^{(n)}(k)$ is a first cluster moment for n = 3;
    $f^{(n)}(i)$ is set to $x_i^2$ and $F^{(n)}(k)$ is a second cluster moment for n = 4;
    $f^{(n)}(i)$ is set to $y_i^2$ and $F^{(n)}(k)$ is a second cluster moment for n = 5;
    $f^{(n)}(i)$ is set to $z_i^2$ and $F^{(n)}(k)$ is a second cluster moment for n = 6;
    $f^{(n)}(i)$ is set to $x_i y_i$ and $F^{(n)}(k)$ is a second cluster moment for n = 7;
    $f^{(n)}(i)$ is set to $x_i z_i$ and $F^{(n)}(k)$ is a second cluster moment for n = 8; and
    $f^{(n)}(i)$ is set to $y_i z_i$ and $F^{(n)}(k)$ is a second cluster moment for n = 9.
 18. The apparatus according to claim 14 wherein $\oplus^{(n)}$ is a scalar addition operator and $f^{(n)}(i)$ is set to the product quantity ##EQU11## where $x_r(i)$ are the r'th coordinates of the i'th site in d dimensions raised to the power m(r), so that $F^{(n)}(k)$ is a cluster moment for n = 1.
 19. The apparatus according to claim 14 wherein
    for n = 1, $f^{(n)}(i)$ is set to the $x_r(i)$ r'th coordinate of the i'th site, $F^{(n)}(k)$ is a maximum cluster coordinate value for the coordinate $x_r$ and $\oplus^{(n)}$ corresponds to picking the maximum value of two operands to which $\oplus^{(n)}$ is applied; and
    for n = 2, $f^{(n)}(i)$ is set to the $x_r(i)$ r'th coordinate of the i'th site, $F^{(n)}(k)$ is a minimum cluster coordinate value for the coordinate $x_r$, and $\oplus^{(n)}$ corresponds to picking the minimum value of two operands to which $\oplus^{(n)}$ is applied.
 20. The apparatus according to claim 14 wherein said means (h) further includes means for outputting a fault alert and stopping if one or more of said $F^{(n)}(k)$ properties is determined to extend beyond a predetermined tolerance.
 21. The apparatus according to claim 14 wherein for n = 1, $f^{(n)}(i)$ is set to 1 if all sites that are neighbors of site i belong to the same class as site i, and otherwise $f^{(n)}(i)$ is set to 0, $F^{(n)}(k)$ is a cluster perimeter sites count and $\oplus^{(n)}$ is a scalar addition operator.
 22. The apparatus according to claim 14 wherein for n = 1, $\oplus^{(n)}$ is a scalar addition operator, $f^{(n)}(i)$ is set to an image property g(i) for site i and $F^{(n)}(k)$ is a cumulative cluster property for g(i).
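
By way of illustration only, the feature choices recited above for cluster population, first and second moments, and maximum and minimum coordinates can be sketched in Python for a two-dimensional image. The class `ClusterFeatures` and its methods `of_site`, `merge`, `centroid` and `variance_x` are hypothetical names introduced for this sketch and are not part of the claimed method. The sketch assumes integer site coordinates and shows that, because each operator is associative and commutative, fragments can be combined in any order with the same result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterFeatures:
    """Accumulated features F^(n)(k) of one cluster fragment (2-D case)."""
    count: int  # population: f = 1, operator = addition
    sx: int     # first moment in x: f = x, operator = addition
    sy: int     # first moment in y
    sxx: int    # second moment: f = x*x
    syy: int    # second moment: f = y*y
    sxy: int    # second moment: f = x*y
    xmin: int   # f = x, operator = minimum
    xmax: int   # f = x, operator = maximum

    @classmethod
    def of_site(cls, x, y):
        """Site properties f^(n)(i) for a single site at (x, y)."""
        return cls(1, x, y, x * x, y * y, x * y, x, x)

    def merge(self, other):
        """Apply the operator for each feature to two fragments."""
        return ClusterFeatures(
            self.count + other.count,
            self.sx + other.sx,
            self.sy + other.sy,
            self.sxx + other.sxx,
            self.syy + other.syy,
            self.sxy + other.sxy,
            min(self.xmin, other.xmin),
            max(self.xmax, other.xmax),
        )

    def centroid(self):
        """Centroid derived from the population and first moments."""
        return (self.sx / self.count, self.sy / self.count)

    def variance_x(self):
        """Spread in x derived from the first and second moments."""
        mean = self.sx / self.count
        return self.sxx / self.count - mean * mean


a, b, c = (ClusterFeatures.of_site(x, y) for x, y in [(0, 0), (2, 0), (1, 3)])
left = a.merge(b).merge(c)    # ((a ⊕ b) ⊕ c)
right = c.merge(b).merge(a)   # ((c ⊕ b) ⊕ a)
```

Derived quantities such as the centroid are computed only when a cluster is output, so the per-site work during the scan stays limited to the additions and comparisons shown in `merge`.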